Web Programming
Lecture 11

Data persistence in web applications with Node.js

Josue Obregon

Seoul National University of Science and Technology
Information Technology Management
Lecture slides index

May 20, 2025

Agenda

  • SQL basics
  • sqlite module for Node.js

Course structure

Roadmaps: Frontend, backend and fullstack

Persistence of data

  • Persistence is “the continuance of an effect after its cause is removed.”
  • Persistent data is stored on a long-lasting storage medium, and data remains intact after modifications or changes to the storage medium.
  • This is the important distinction between persistent data and non-persistent data: the data survives after the process ends.
  • In other words, for data to be stored persistently, it must write to non-volatile storage.
  • Data persistence ensures that valuable business information remains accessible in a database and consistent across sessions, devices, and applications.
  • Data stored could be:
    • transactional data (dynamic data) on ecommerce platforms
    • customer records (sensitive data)
    • user-generated content
  • The point of saving data in a non-volatile storage system is to make sure it can be reliably retrieved later.

Is File-Processing the Best Way?

  • We have seen how to store JSON data in a file
  • On the server, we have a lot of functionality to access the file system
  • JSON files are useful for simple, temporary storage
  • But they lack structure, concurrency control, and scalability
    • Processing files can get a bit tedious and it’s easy to accidentally overwrite data.

Example - Airlines and flights data

  • Consider an airline application that manages flights, where each flight is structured like what’s shown on the right
  • Each flight entry records its origin, destination, and duration (in minutes).
  • This kind of structured data is essential for applications that need to persist flight schedules, search routes, or calculate travel times.
origin destination duration
New York London 415
Shanghai Paris 760
Istanbul Tokyo 700
New York Paris 435
Moscow Paris 245
Lima New York 455

Can we use JSON files for this application?

  • When working with file-based JSON, things get messy as the application grows.
  • What If We Want to Filter Our Data?
    • All flights from New York
    • The shortest flight
    • Flights that arrived in Paris
    • Flights last week
    • Flights longer than 5 hours
    • etc…
  • Question: How might you write the code to filter flights in all these different ways using plain files?
    • Loading the entire file every time
    • Parsing and iterating manually
    • Writing custom logic for every case
    • Risking data loss when writing back
  • A database allows us to do this with a simple query!

flights.json

{
  "flights": [
    {
      "origin": "New York",
      "destination": "London",
      "duration": 415
    },
    {
      "origin": "Shanghai",
      "destination": "Paris",
      "duration": 760
    },
    {
      "origin": "Istanbul",
      "destination": "Tokyo",
      "duration": 700
    }
  ]
}

What is a Database

  • A database is an electronically stored, systematic collection of data.
  • It can contain any type of data, including words, numbers, images, videos, and files.
  • You can use software called a database management system (DBMS) to store, retrieve, and edit data.
  • Databases allow:
    • Querying data efficiently
    • Handling multiple users safely
    • Enforcing data types and constraints
    • Persisting and retrieving data reliably
  • Ideal for applications with growing data and users
  • What are some examples of data you could store in a database?
    • Pokedex, Book Reviews, Store Inventory, User Information, Airlines and flight information

Database Software

  • A database management system (DBMS) is a software system for creating and managing databases.
  • A DBMS enables end users to create, protect, read, update and delete data in a database.
  • It also manages security, data integrity and concurrency for databases.

Popular DBMSs

Types of Databases

Types of databases:

  • Hierarchical: Organizes data in a tree-like structure with parent-child relationships.
  • Relational: Stores data in tables with rows and columns, using keys to relate them
  • Object-oriented: Store data as objects, like in object-oriented programming
  • NoSQL:Handle unstructured or semi-structured data, using flexible formats like JSON or key-value pairs.

Structured Query Language (SQL)

  • A “domain-specific language” (HTML is also a DSL) designed specifically for data access and management.
  • SQL is a declarative language: describes what data you are seeking, not exactly how to find it.
    • HTML: markup language, JavaScript: interpreted imperative language
  • In SQL, you write statements. The main different types of statements we’ll look at:
    • Data Definition Language (DDL): is used to create, modify, or destroy objects within an RDBMS
    • Data Manipulation Language (DML): is the domain of INSERT, UPDATE, and DELETE, which you use to manipulate data.

SQLite data types

  • NULL: The value is a NULL value. Like in JavaScript, stands for the absence of value
  • INTEGER: The value is a signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
  • REAL: The value is a floating-point value, stored as an 8-byte IEEE floating point number.
  • TEXT: The value is a text string, stored using the database encoding (UTF-8, UTF-16BE or UTF-16LE).
  • BLOB: The value is a blob of data, stored exactly as it was input (Binary Large Object, e.g., audio and video)

CREATE TABLE

  • CREATE TABLE is used to create a new table.
CREATE TABLE flights (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    origin TEXT NOT NULL,
    destination TEXT NOT NULL,
    duration INTEGER NOT NULL
);
  • Constraints - they “constrain” the types of values you can insert in a column.
    • PRIMARY KEY (keyname): Used to specify a column or group of columns uniquely identifies a row in a table.
      • Every table should have a column which is used to uniquely identify each row.
      • This improves efficiency and will prove very useful when using multiple tables.
    • AUTOINCREMENT: Used with an integer primary key column to automatically generate the “next” value for the key.
      • In SQLite, only available for a primary key that’s an INTEGER.
    • NOT NULL: prevents NULL entries in a column, requires the value to be set in INSERT statements.
    • DEFAULT: specifies default values for a column if not provided in an INSERT statement

INSERT INTO TABLE

  • To insert a new record into a table, we use the INSERT INTO keyword
INSERT INTO flights
    (origin, destination, duration)
    VALUES ("New York", "London", 415);
  • First provide the table name, then optionally the list of columns you want to set (by default it sets all columns).
  • Columns left out will be set to NULL, unless they have AUTOINCREMENT set or some DEFAULT value set.
  • Then provide the values for each column, which must match the column names specified.

SELECT statement

  • The SELECT statement is used to return data from a database.
  • It returns the data in a result table containing the row data for column name(s) given.
  • Table and column names are case-sensitive.
SELECT * FROM flights;

SELECT some columns

SELECT origin, destination FROM flights;

SELECT rows using the WHERE clause

  • The WHERE clause filters out rows based on their columns’ data values.
  • The WHERE portion of a SELECT statement can use the following operators:
    • =, >, >=, <, <=
    • <> or != (not equal)
    • LIKE pattern
    • IN (value, value, …, value)
SELECT * FROM flights WHERE id = 3;

SELECT rows based on a text column

SELECT * FROM flights WHERE origin = "New York";

SELECT a few rows based on numerical column

SELECT * FROM flights WHERE duration > 500;

SELECT rows based on a conjunction of conditions

SELECT * FROM flights WHERE duration > 500 AND destination = "Paris";

SELECT rows based on a disjunction of conditions

SELECT * FROM flights WHERE duration > 500 OR destination = "Paris";

SELECT rows using the IN operator

  • The IN operator allows you to specify multiple values in a WHERE clause.
  • The IN operator is a shorthand for multiple OR condition
SELECT * FROM flights WHERE origin IN ("New York", "Lima");

SELECT rows using the LIKE operator

  • The SQL LIKE operator is used for performing pattern-based searches in a database.
  • There are two wildcards often used in conjunction with the LIKE operator:
    • The percent sign %represents zero, one, or multiple characters
    • The underscore sign _ represents one, single character
SELECT * FROM flights WHERE origin LIKE "%a%";

Grouping functions

  • AVERAGE
  • COUNT
  • MAX
  • MIN
  • SUM …

UPDATE

  • To update an existing record in a table, we use the UPDATE and SET keywords.
UPDATE flights
SET duration = 430
WHERE origin = "New York" AND destination = "London";

DELETE

  • To delete a record from a table, we use the DELETE keyword:
DELETE FROM flights WHERE destination = "Tokyo";

Warning

  • What happens if we forget to add a WHERE clause?

Other clauses

  • LIMIT
  • ORDER BY
  • GROUP BY
  • HAVING

Foreign keys

  • A foreign key is a column in one table that links to the primary key in another table, creating a relationship between the two tables and ensuring data consistency.
  • Let’s go back to our flights table
id origin destination duration
1 New York London 415
2 Shanghai Paris 760
3 Istanbul Tokyo 700
4 New York Paris 435
5 Moscow Paris 245
6 Lima New York 455

Adding the airport code to flights table

id origin origin_code destination destination_code duration
1 New York JFK London LHR 415
2 Shanghai PVG Paris CDG 760
3 Istanbul IST Tokyo NRT 700
4 New York JFK Paris CDG 435
5 Moscow SVO Paris CDG 245
6 Lima LIM New York JFK 455

Airports table

id code city
1 JFK New York
2 PVG Shanghai
3 IST Istanbul
4 LHR London
5 SVO Moscow
6 LIM Lima
7 CDG Paris
8 NRT Tokyo

Old flights table

id origin destination duration
1 New York London 415
2 Shanghai Paris 760
3 Istanbul Tokyo 700
4 New York Paris 435
5 Moscow Paris 245
6 Lima New York 455

Normalized flight table

  • A foreign key is a column in one table that links to the primary key in another table, creating a relationship between the two tables and ensuring data consistency.
id origin_id destination_id duration
1 1 4 415
2 2 7 760
3 3 8 700
4 1 7 435
5 5 7 245
6 6 1 455

Flights table with foregin keys

  • Now we also want to store the information of passengers
id first last flight_id
1 Harry Potter 1
2 Ron Weasley 1
3 Hermione Granger 2
4 Draco Malfoy 4
5 Luna Lovegood 6
6 Ginny Weasley 6

People table

id first last
1 Harry Potter
2 Ron Weasley
3 Hermione Granger
4 Draco Malfoy
5 Luna Lovegood
6 Ginny Weasley

Normalized passenger table

person_id flight_id
1 1
2 1
2 4
3 2
4 4
5 6
6 6

How can we use multiple tables in one SQL query?

  • When you have relationships defined with FOREIGN KEY/PRIMARY KEY, it is often useful to reference both tables to combine data (e.g. displaying the airport info for a flight, or the passenger info for a flight)
  • We can reference multiple tables either with an additional WHERE constraint or the JOIN keyword.

Multi table WHERE syntax

SELECT people.first, flights.origin, flights.destination
FROM people, flights, passengers
WHERE people.id = passengers.person_id
AND flights.id = passengers.flight_id;
  • Or we can use joins…

JOIN

SELECT people.first, flights.origin, flights.destination
FROM passengers
JOIN people ON passengers.person_id = people.id
JOIN flights ON passengers.flight_id = flights.id;
first origin destination
Harry New York London
Ron New York London
Hermione Shanghai Paris
Draco New York Paris
Luna Lima New York
Ginny Lima New York

Types of JOINs

  • JOIN / INNER JOIN
  • LEFT OUTER JOIN
  • RIGHT OUTER JOIN
  • FULL OUTER JOIN

Intersection of tables is denoted by:

ON tableA.id = tableB.id;
  • Not all of these joins are supported by SQLITE.
  • For example, full outer join and right joins are not currently supported.

SQL Injection

  • An SQL injection is a security vulnerability where attackers insert malicious SQL code into a query input.
  • It can allow attackers to bypass authentication, steal, modify, or delete data in the database.
  • SQL injection happens when user input is not properly validated or directly included in SQL statements.
  • Let’s say we have the following form in a webpage

  • And we have the following SQL query to retrieve the information about a user
SELECT * FROM users
WHERE username = username AND password = password;

Retrieving harry’s information

  • Harry login introducing his information

  • Inserting harry’s credentials in our previous query results in the following query:
SELECT * FROM users
WHERE username = "harry" AND password = "12345";

SQL injection in action

  • Consider the situation where a malicious user enters this as the user input: ‘hacker”–’

  • Inserting the hacker’s credentials in our previous query results in the following query:
-- this is a comment in SQL
SELECT * FROM users
WHERE username = "hacker"--" AND password = "";

Data layer of a web application with Node.js

  • sqlite vs sqlite3
  • sqlite3 is the official SQLite bindings for Node.js.
  • sqlite3 is the most popular module used to connect to a SQLite database in Node.js
  • Similar to the fs module, sqlite3 functions by default use callbacks for results and errors
  • Callbacks?
  • We will use sqlite which is a promise-based wrapper built on top of sqlite3.

Using SQL in Node.js

  • First install the sqlite3 and sqlite modules in your project.
npm install sqlite sqlite3
  • Second, require both modules with the rest of your modules in your Node.js program.
const sqlite3 = require('sqlite3');
const sqlite = require('sqlite');

SQL connection

const sqlite3 = require('sqlite3');
const sqlite = require('sqlite');
const fs = require('fs');

const DB_NAME = "airline.db";
/**
 * Establishes a database connection to the database and returns the database object.
 * Any errors that occur should be caught in the function that calls this one.
 * @returns {Promise<sqlite3.Database>} - The database object for the connection.
 */
async function getDBConnection(dbFileName) {

    const db = await sqlite.open({
        filename: dbFileName, 
        driver: sqlite3.Database
    });
    console.log("Connection succesful.")
    return db;
}
  • If you run getDBConnection function, you will see a file called airline.db created in your root folder.
  • SQLite is a file based, single disk file (read more here)

Methods for Querying the Database

  • .all()
    • execute an SQL query, returns a Promise which resolves to an array of objects with all the resulting rows
  • .exec()
    • Executes a SQL query, returns a Promise which resolves to undefined (nothing is returned by this method)
  • .get()
    • executes an SQL query, returns a Promise which resolves to an object of only the first resulting row
  • .run()
    • executes a SQL query, returns a Promise which resolves to an object with metadata but no results/data from the query
  • Establishing a connection using our getDbConnection function
    • Inside an async function
    let db = await getDbConnection(DB_NAME); //just for testing
    • Using .then/.catch syntax
    getDBConnection(DB_NAME); //just for testing

Loading our dataset

  • The following code creates the flights table and insert some files.
    • You can get the flights_schema.sql file here
async function loadDB(dbFileName, sqlFileName) {
    try {
        let db = await getDBConnection(dbFileName);
        const sql = fs.readFileSync(sqlFileName, 'utf8');
        const result = await db.exec(sql);

        console.log('Database loaded successfully.');
        console.log(result);

        await db.close();
    } catch (err) {
        console.error('Failed to load database:', err.message);
    }
}

loadDB(DB_NAME, "flights_schema.sql");

db.close()

  • The close function closes the db connection and ensures that its associated resources are deallocated.
  • If we don’t close our connections it’s possible we could consume all the memory available to our program, which would cause it to crash.
  • This is known as a memory leak.
  • .close() returns a Promise, so it must be awaited

Executing SQL queries with the db object

  • Once you have some data and your db object, you can now execute SQL queries with db.all
  • This function takes a SQL query string and an optional array of parameters/placeholder values and returns the resulting rows.
async function executeQuery() {
    try {
        let db = await getDBConnection(DB_NAME);

        const sql = "SELECT * FROM flights;"
        let result = await db.all(sql);
        console.log(result);
        await db.close();
    } catch (err) {
        console.error('Failed to load database:', err.message);
    }
}

executeQuery();

Using db.exec(sqlString)

  • Executes a SQL query, returns a Promise which resolves to undefined (nothing is returned by this method)
async function executeInsert() {
    try {
        let db = await getDBConnection(DB_NAME);

        const sql = "INSERT INTO flights (origin, destination, duration) VALUES (\"Incheon\", \"Monterrey\", 960);"
        let result = await db.exec(sql);
        console.log(result);
        await db.close();
    } catch (err) {
        console.error('Failed to load database:', err.message);
    }
}

executeInsert();

Using db.run(sqlString)

  • Executes an SQL query, returns a Promise which resolves to an object with metadata but no results/data from the query
async function runInsert() {
    try {
        let db = await getDBConnection(DB_NAME);

        const sql = "INSERT INTO flights (origin, destination, duration) VALUES (\"Guatemala\", \"Dallas\", 180);"
        let result = await db.run(sql);
        console.log(result);
        await db.close();
    } catch (err) {
        console.error('Failed to load database:', err.message);
    }
}

runInsert();

Extracting the data

  • The column (field) names for each row (record) can be accessed using dot notation (it’s just an object!)
  • Note that only the column names specified in the SELECT statement will be accessible
    • In this case we selected all collumns with the wildcard *
async function extractData() {
    try {
        let db = await getDBConnection(DB_NAME);

        const sql = "SELECT * FROM flights;"
        let result = await db.all(sql);
        result.forEach(flight => {
            console.log(`Flight ${flight.id}, from ${flight.origin} to ${flight.destination} takes ${flight.duration} minutes.`)
        });
        await db.close();
    } catch (err) {
        console.error('Failed to load database:', err.message);
    }
}

extractData();

Practice yourself

  • Insert three new flights into the flights table
    • From Incheon to San Francisco, Incheon to Amsterdam, Paris to Rome
  • Query all flights longer than 600 minutes
  • Find all flights arriving in Paris
  • List flights originating from New York, Seoul, or Paris
  • Create an API endpoint /flights that returns a JSON listing all the flights

Next week

  • Develop airline webapp and user authentication

Acknowledgements


Back to title slide Back to lecture slides index