Question

executemany seems to be very slow for deletion (insertion is fine), and I was wondering if anyone knows why it takes so long.

Consider the code below:

import sqlite3

db = sqlite3.connect("mydb")
c = db.cursor()
c.execute("DROP TABLE IF EXISTS testing ")
c.execute("CREATE TABLE testing (val INTEGER);")
my_vals2 = [[x] for x in range(1,10000)]

def insertmany(vals):
    c.executemany("INSERT INTO testing (val) VALUES (?)",vals)
    db.commit()

def deletemany1(vals):
    c.executemany("DELETE FROM testing WHERE val=?",vals)
    db.commit()

def deletemany2(vals):
    # This is the fastest, even though I loop once to convert to strings and again to join.
    vals = ["'%s'" % v[0] for v in vals]
    # Builds: DELETE FROM testing WHERE val IN ('1','2','3',...)
    c.execute("DELETE FROM testing WHERE val IN (%s)" % ",".join(vals))

And here are the timing results from IPython (timeit was giving odd data, so I used %time):

%time insertmany(my_vals2) 
#CPU times: user 0.60 s, sys: 0.00 s, total: 0.60 s Wall time: 0.60 s

%time deletemany1(my_vals2)
#CPU times: user 3.58 s, sys: 0.00 s, total: 3.58 s Wall time: 3.58 s

%time deletemany2(my_vals2)
#CPU times: user 0.02 s, sys: 0.00 s, total: 0.02 s Wall time: 0.02 s

And just for the sake of completeness, here are the timeit results (but I think timeit is broken on the second test, or its ms is a different unit than in the first test):

%timeit insertmany(my_vals2) 
#1 loops, best of 3: 358 ms per loop

%timeit deletemany1(my_vals2)
#1 loops, best of 3: 8.34 ms per loop  <- this is not faster than the above!!!! (timeit lies?)

%timeit deletemany2(my_vals2)
#100 loops, best of 3: 2.3 ms per loop  

So why is executemany so slow with deletes?

Solution

SQLite stores table records in a B+ tree, sorted by rowid.

When you are inserting with an automatically generated rowid, all records are just appended at the end of the table. However, when deleting, SQLite has to search for the record first. This is slow if the column in the WHERE clause (val in your table) is not indexed, because every single DELETE then has to scan the whole table. Either create an explicit index (as proposed by John), or declare the column as INTEGER PRIMARY KEY to make it the rowid.
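The difference can be made visible by asking SQLite for its query plan. The following is a minimal sketch, assuming an in-memory database; the index name idx_testing_val and the second table testing_pk are made up for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # in-memory database, used only for this illustration
c = db.cursor()
rows = [(x,) for x in range(1, 10000)]

# Table as in the question: no index on val.
c.execute("CREATE TABLE testing (val INTEGER)")
c.executemany("INSERT INTO testing (val) VALUES (?)", rows)

# Hypothetical variant: val is the INTEGER PRIMARY KEY, i.e. an alias for the rowid.
c.execute("CREATE TABLE testing_pk (val INTEGER PRIMARY KEY)")
c.executemany("INSERT INTO testing_pk (val) VALUES (?)", rows)

def delete_plan(cursor, table):
    """Return SQLite's query plan text for a single-row DELETE on `table`.

    `table` is one of the fixed names above, never user input.
    """
    plan = cursor.execute(
        f"EXPLAIN QUERY PLAN DELETE FROM {table} WHERE val=?", (1,)
    ).fetchall()
    # The human-readable description is the last column of each plan row.
    return " | ".join(row[-1] for row in plan)

plan_no_index = delete_plan(c, "testing")      # full table scan per DELETE
c.execute("CREATE INDEX idx_testing_val ON testing (val)")  # hypothetical index name
plan_with_index = delete_plan(c, "testing")    # index lookup per DELETE
plan_primary_key = delete_plan(c, "testing_pk")  # rowid lookup per DELETE

# The same executemany call as in the question, now backed by a lookup structure.
c.executemany("DELETE FROM testing WHERE val=?", rows)
c.executemany("DELETE FROM testing_pk WHERE val=?", rows)
db.commit()
remaining = c.execute("SELECT COUNT(*) FROM testing").fetchone()[0]
remaining_pk = c.execute("SELECT COUNT(*) FROM testing_pk").fetchone()[0]
```

Printing the three plan strings should show a SCAN for the unindexed table and a SEARCH for the other two cases, which is the difference between visiting every row for each of the 9999 deletes and jumping directly to the matching row.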

Bulk inserts into a table with an index are faster if the index does not exist while you insert, i.e., if you create the index only after the bulk inserts.
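As a rough sketch of that ordering (the helper name bulk_load_then_index and the index name idx_testing_val are assumptions for illustration, not anything from the question), the load could look like this:

```python
import sqlite3

def bulk_load_then_index(db, values):
    """Insert all values first, then build the index in one pass."""
    c = db.cursor()
    c.execute("DROP TABLE IF EXISTS testing")
    c.execute("CREATE TABLE testing (val INTEGER)")
    # No index exists yet, so each insert only appends to the table.
    c.executemany("INSERT INTO testing (val) VALUES (?)", ((v,) for v in values))
    # Build the index once, after the data is in place (hypothetical index name).
    c.execute("CREATE INDEX idx_testing_val ON testing (val)")
    db.commit()

db = sqlite3.connect(":memory:")  # in-memory database, used only for this illustration
bulk_load_then_index(db, range(1, 10000))

row_count = db.execute("SELECT COUNT(*) FROM testing").fetchone()[0]
index_names = [
    r[0]
    for r in db.execute(
        "SELECT name FROM sqlite_master WHERE type='index' AND tbl_name='testing'"
    )
]
```

Whether this is measurably faster depends on the data size; for a table as small as the one in the question the difference may be hard to notice.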

Your last delete command deletes all records at once. If you know that you're deleting all records in the table, you could speed it up even further by using just DELETE FROM testing, which doesn't need to look at any records at all.
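For the special case where every row is to be removed, a minimal sketch (again assuming an in-memory database with the table from the question) is:

```python
import sqlite3

db = sqlite3.connect(":memory:")  # in-memory database, used only for this illustration
c = db.cursor()
c.execute("CREATE TABLE testing (val INTEGER)")
c.executemany("INSERT INTO testing (val) VALUES (?)", [(x,) for x in range(1, 10000)])
db.commit()

before = c.execute("SELECT COUNT(*) FROM testing").fetchone()[0]

# No WHERE clause: SQLite does not have to compare any row against a value.
c.execute("DELETE FROM testing")
db.commit()

after = c.execute("SELECT COUNT(*) FROM testing").fetchone()[0]
```

This only applies when the whole table really is meant to be emptied; as soon as some rows must survive, a WHERE clause (and ideally an index on the filtered column) is needed again.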

OTHER TIPS

I'm just taking a punt: Because it has to search exhaustively for the ones to delete. Try it with an index and report back.

CREATE INDEX foo ON testing (val)

http://sqlite.org/lang_createindex.html

Licensed under: CC-BY-SA with attribution