Scale Read Write Concurrency With SQLite WAL Mode

Personally I'm a big fan of SQLite, because it is simple and requires minimal setup vs other database engines. That being said with moderate concurrent read and write operations, you can quickly run into database is locked errors.

The reason for this is because by default SQLite uses a "rollback journal" as its journal mode, which is a mechanism that allows the database to rollback to a previous state if a transaction fails. This is roughly how it works according to the SQLite documentation - 2.1 Rollback Journals during a transaction:

It writes the new page directly to the database file (say your_database_name.db)
It creates a your_database_name.db-journal file in the same directory as your database file, and store the previous page content in it.
On the event of transaction successful commit, it deletes the your_database_name.db-journal file.
On the event of transaction failed, it restores the previous page content from the your_database_name.db-journal file to the database file.

This is a very neat way of ensuring ACID, however due to database files being frequently shuffle around, read and write operations cannot be performed concurrently.

This simplist way to fix this issue without changing the journal mode is to increase the timeout to 20 seconds plus, however this is also likely to cause slow DB queries as the RW operations are queued up, which in turn cause slow request timeouts and bad customer experience.

The alternative is to change the journal mode to WAL (Write-Ahead Logging), which is what we will be discussing in this article.

How WAL Mode Works

Similar to the rollback journal mode, WAL mode also creates a journal file called your_database_name.db-wal, but the difference is that:

Whenever a transaction commit happens, instead of directly writing the new page to the database file, it writes it to the your_database_name.db-wal file as a write-ahead log journal without writing to the database file.
On the event of a read operation, it will first read from WAL file for relevant existing pages, and then the pages from the database file.
The transactions in the WAL file will be flushed to the database file in the background with the checkpoint mechanism to ensure data consistency.

Here are some of the benefits of using WAL mode:

It significantly improve the read throughput, as the read operations are no longer blocked by the write operations.
Write operations are no longer blocked by the read operations for the same reason.
It generally improve the disk I/O throughput as there is no longer blocked by arbitrary mutual exclusion file lock from the rollback journal mode.

Some of the obvious (but possibly negligible) drawbacks are:

Larger database file size
Checkpoint process blocks the read operations but shouldn't be a problem if you flush frequently (similar to how lower filesystem dirty writeback improves performance as it eliminates chance of long IO wait).
Possibly slower read operations as the read operations need to read both WAL and database file, but personally I haven't observed any difference.

How to Enable WAL Mode

You can enable WAL mode by simly running:

sqlite3 $PATH_TO_YOUR_DB.db 'PRAGMA journal_mode=WAL;'

You can also do it in your language of choice, for example in Python:

from sqlmodel import create_engine, text

engine = create_engine(
    "sqlite:///PATH_TO_YOUR_DB.db",
    connect_args={"check_same_thread": False},
    # echo=True,
)
with engine.connect() as conn:
    conn.execute(text("PRAGMA journal_mode=WAL"))
    conn.close()

After enabling WAL mode, you can verify it by checking wheather your_database_name.db-wal and your_database_name.db-shm files are created in the same directory as your database file. Note the -shm file is the shared memory file for multiple processes to access the WAL file at the same time.

From my experience all the occasional database is locked errors go away after enabling WAL mode.

Conclusion

WAL mode makes SQLite a very viable production-grade database solution. You only needs a few lines of code to enable it, which makes it a no-brainer.

You can also archive the "serverless" experience by using the litestream to replicate the database to the cloud storage with little effort and costs. I will cover this topic in another article.