Question

The SQLite documentation says (here) that you can avoid checkpoint pauses in WAL mode by running the checkpoints on a separate thread. I tried this, and it doesn't appear to work: the '-wal' file grows without bound, it is unclear whether anything is actually getting copied back into the main database file, and (most importantly) once the -wal file has gotten big enough (over a gigabyte) the main thread starts having to wait for the checkpointer.

In my application the main thread continuously does something essentially equivalent to this, where generate_data is going to spit out on the order of a million rows to be inserted:

import sqlite3

batch_limit = 1000  # example value; the real limit is not stated here
batch_size = 0

db = sqlite3.connect("database.db")
cursor = db.cursor()
cursor.execute("PRAGMA wal_autocheckpoint = 0")
for datum in generate_data():
    # It is a damned shame that there is no way to do this in one operation.
    cursor.execute("SELECT id FROM strings WHERE str = ?", (datum.text,))
    row = cursor.fetchone()
    if row is not None:
        id = row[0]
    else:
        cursor.execute("INSERT INTO strings VALUES(NULL, ?)", (datum.text,))
        id = cursor.lastrowid
    cursor.execute("INSERT INTO data VALUES (?, ?, ?)",
                   (id, datum.foo, datum.bar))
    batch_size += 1
    if batch_size > batch_limit:
        db.commit()
        batch_size = 0

and the checkpoint thread does this:

import sqlite3
import time

db = sqlite3.connect("database.db")
cursor = db.cursor()
cursor.execute("PRAGMA wal_autocheckpoint = 0")
while True:
    time.sleep(10)
    cursor.execute("PRAGMA wal_checkpoint(PASSIVE)")

(Being on separate threads, they have to have separate connections to the database, because pysqlite doesn't support sharing a connection among multiple threads.) Changing to a FULL or RESTART checkpoint does not help - then the checkpoints just fail.

How do I make this actually work? Desiderata are: 1) main thread never has to wait, 2) journal file does not grow without bound.


Solution

Checkpointing needs to lock the entire database, so all other readers and writers would have to be blocked. (A passive checkpoint just aborts.)

So running checkpointing in a separate thread does not increase concurrency. (The SQLite documentation suggests this only because the main thread might not be designed to handle checkpointing at idle moments.)

If you continuously access the database, you cannot checkpoint. If your batch operations make the WAL file grow too big, you should insert explicit checkpoints into that loop (or rely on autocheckpointing).
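As a rough illustration of putting explicit checkpoints into the batch loop, here is a minimal sketch. The function name insert_with_checkpoints, the two-column data table, and the choice of a TRUNCATE checkpoint are assumptions made for the example, not part of the original code; TRUNCATE needs SQLite 3.8.8 or later, and PASSIVE could be substituted. The checkpoint runs on the writer's own connection immediately after commit(), when that connection has no transaction open.

```python
import sqlite3


def insert_with_checkpoints(path, rows, batch_limit=1000):
    """Insert (foo, bar) rows in batches, checkpointing after each commit.

    Returns the result row of the last checkpoint: (busy, log, checkpointed).
    The table layout is a simplified stand-in for the question's schema.
    """
    db = sqlite3.connect(path)
    try:
        db.execute("PRAGMA journal_mode = WAL")
        # Disable autocheckpointing; this loop checkpoints by hand.
        db.execute("PRAGMA wal_autocheckpoint = 0")
        db.execute("CREATE TABLE IF NOT EXISTS data (foo, bar)")
        pending = 0
        for foo, bar in rows:
            db.execute("INSERT INTO data VALUES (?, ?)", (foo, bar))
            pending += 1
            if pending >= batch_limit:
                db.commit()
                # No transaction is open on this connection right now, so
                # the checkpoint is not held up by our own writes.
                db.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
                pending = 0
        # Commit and checkpoint whatever is left in the final partial batch.
        db.commit()
        return db.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
    finally:
        db.close()
```

The first column of the returned row is nonzero if the checkpoint could not finish, for example because another connection was holding a read transaction; in that case the WAL file keeps growing until a later checkpoint succeeds.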

Licensed under: CC-BY-SA with attribution