I have a long-term data logging service that produces files, each containing one day's worth of data. I'm loading the files into an SQLite DB in a Windows Forms application. The procedure that inserts the data from a file into the DB includes two queries, whose results are used in the subsequent inserts.

    Using SQLconnect As New SQLite.SQLiteConnection("Data Source=" & fn & ";")
        SQLconnect.Open()
        Using SQLcommand As SQLite.SQLiteCommand = SQLconnect.CreateCommand
            Dim SqlTrans As System.Data.SQLite.SQLiteTransaction = SQLconnect.BeginTransaction
            For Each Path As String In paths
                fs = System.IO.File.Open(Path, IO.FileMode.Open, IO.FileAccess.Read, IO.FileShare.Read) 'Open file
                Do While ReadFromStoreFile(fs, dt, Sent) = True 'Read a Timestamp/sentence pair
                    'Create a positions table for this MMSI if one doesn't already exist
                    SQLcommand.CommandText = "CREATE TABLE IF NOT EXISTS MMSI" & msg.MMSI & " (PosID INTEGER PRIMARY KEY AUTOINCREMENT, Date NUMERIC, Lat REAL, Lon REAL, Status INTEGER, SOG REAL, COG INTEGER, HDG INTEGER, VoyageID INTEGER);"
                    SQLcommand.ExecuteNonQuery()
                    Select Case msg.Type 'Dynamic position report
                        Case AIS.MsgType.PosRptClsA
                            '###THIS QUERY TAKES 20 secs per file (day) and increases 3 seconds per day!
                            SQLcommand.CommandText = "SELECT * FROM Voyages WHERE MMSI = " & msg.MMSI & " ORDER BY VoyageID DESC LIMIT 1" 'still the same
                            SQLreader = SQLcommand.ExecuteReader()
                            SQLreader.Read()
                            VID = SQLreader.Item(0)
                            SQLreader.Close()
                            SQLcommand.CommandText = "INSERT INTO MMSI" & msg.MMSI & " (Date, Lat, Lon, Status, SOG, COG, HDG, VoyageID) VALUES (" & ts & ", " & msg.Latitude & ", " & msg.Longitude & ", " & msg.NavStatus & ", " & SOG & ", " & COG & ", " & HDG & ", " & VID & ")"
                            SQLcommand.ExecuteNonQuery()
                            SQLreader.Close()
                        Case AIS.MsgType.StatAndVge
                            'Find the latest entry for this MMSI in the Voyages table
                            '###THIS QUERY takes 3 secs for same number of queries and does NOT increase
                            SQLcommand.CommandText = "SELECT * FROM Voyages WHERE MMSI = " & msg.MMSI & " ORDER BY VoyageID DESC LIMIT 1"
                            SQLreader = SQLcommand.ExecuteReader()
                            SQLreader.Read()
                            Dim NoVoyage As Boolean = Not (SQLreader.HasRows)
                            If Not NoVoyage Then
                                'If the data has changed, add a new entry
                                If Not (SQLreader.Item(2) = msg.Length) Then Changed = True
                                If Not (SQLreader.Item(3) = msg.Breadth) Then Changed = True
                                If Not (SQLreader.Item(4) = msg.Draught) Then Changed = True
                                If Not (SQLreader.Item(5) = msg.Destination) Then Changed = True
                                If Not (SQLreader.Item(6) = msg.ETA.Ticks) Then Changed = True
                                VoyageID = SQLreader.Item(0)
                            End If
                            SQLreader.Close()
                            If Changed Or NoVoyage Then
                                Changed = False 'reset flag
                                SQLcommand.CommandText = "INSERT INTO Voyages (Date, Length, Breadth, Draught, Destination, ETA, MMSI) VALUES (" & ts & ", " & msg.Length & ", " & msg.Breadth & ", " & msg.Draught & ", '" & msg.Destination.Replace("'", "''") & "', " & msg.ETA.Ticks & ", " & msg.MMSI_int & ")"
                                SQLcommand.ExecuteNonQuery()
                                SQLcommand.CommandText = "SELECT last_insert_rowid() FROM Voyages"
                                SQLreader = SQLcommand.ExecuteReader()
                                SQLreader.Read()
                                VoyageID = SQLreader.Item(0)
                                SQLreader.Close()
                            End If
                    End Select 'message type
                Loop 'Read next entry from file
                fs.Close() 'Close the file
                'Write this file into the files table, so we know it has been written to the DB
                fileinf = New System.IO.FileInfo(Path)
                SQLcommand.CommandText = "INSERT OR REPLACE INTO Files (Name, Size, Date) VALUES ('" & fileinf.Name & "', '" & fileinf.Length & "', '" & fileinf.LastWriteTimeUtc.Ticks & "')"
                SQLcommand.ExecuteNonQuery()
            Next 'The next path in the list of paths to decode
            SqlTrans.Commit() 'End of all files reached, commit all the changes to the DB
        End Using 'SQLcommand
    End Using 'SQLconnect

As indicated in the code, the first query takes a very long time and, more importantly, gets slower as data is loaded into the DB. With 21 days of data already in the DB, this query alone takes a cumulative time of around 20 secs per file (one day of data), and that figure increases by about 3 secs for each further day added. The really strange thing is that the second query, which looks identical to me, is fast (around 3 secs for the same number of queries) and does NOT slow down as more data is added.
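
For reference, cumulative per-query timings like the ones quoted here can be collected with a small helper built on `System.Diagnostics.Stopwatch`. The following is only a sketch of one way to do it: the `CumulativeTimer` class, its method names, and the labels are hypothetical and not part of the code above, and the `Thread.Sleep` call merely stands in for a real `ExecuteReader` call.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.Threading

' Hypothetical helper: accumulates elapsed time per label across many calls.
Public Class CumulativeTimer
    Private ReadOnly timers As New Dictionary(Of String, Stopwatch)

    ' Runs the given action and adds its duration to the total for the label.
    Public Sub Measure(label As String, work As Action)
        Dim sw As Stopwatch = Nothing
        If Not timers.TryGetValue(label, sw) Then
            sw = New Stopwatch()
            timers(label) = sw
        End If
        sw.Start()
        Try
            work()
        Finally
            sw.Stop() 'Stop even if the action throws, so totals stay consistent
        End Try
    End Sub

    ' Total time recorded so far for the label (zero if it was never measured).
    Public Function Total(label As String) As TimeSpan
        Dim sw As Stopwatch = Nothing
        If timers.TryGetValue(label, sw) Then Return sw.Elapsed
        Return TimeSpan.Zero
    End Function
End Class

Module TimingSketch
    Sub Main()
        Dim timer As New CumulativeTimer()
        For i As Integer = 1 To 3
            'Stand-in for the real work, e.g. SQLcommand.ExecuteReader()
            timer.Measure("PosRpt SELECT", Sub() Thread.Sleep(10))
        Next
        Console.WriteLine("PosRpt SELECT total: " & timer.Total("PosRpt SELECT").ToString())
    End Sub
End Module
```

Wrapping each of the two `SELECT` statements in its own `Measure` call with a distinct label keeps their totals separate, so the growth of one can be compared against the other file by file.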
Here is the function that creates the empty database:

    Public Function CreateDB(fn As String, Force As Boolean) As Boolean
        If System.IO.File.Exists(fn) Then
            If Force Then
                System.IO.File.Delete(fn) 'Delete the old DB and create a new one
            Else
                Return True 'DB already exists so just return true
            End If
        End If
        Using SQLconnect As New SQLite.SQLiteConnection
            SQLconnect.ConnectionString = "Data Source=" & fn & ";"
            SQLconnect.Open()
            'Create Tables
            Using SQLcommand As SQLite.SQLiteCommand = SQLconnect.CreateCommand
                'Set page size
                SQLcommand.CommandText = "PRAGMA Page_size = 4096;"
                SQLcommand.ExecuteNonQuery()
                'Set journalling mode to off
                SQLcommand.CommandText = "PRAGMA journal_mode = OFF;"
                SQLcommand.ExecuteNonQuery()
                'Set auto indexing off
                SQLcommand.CommandText = "PRAGMA automatic_index = false;"
                SQLcommand.ExecuteNonQuery()
                'Create Vessels Table
                SQLcommand.CommandText = "CREATE TABLE Vessels(MMSI TEXT PRIMARY KEY, Name TEXT, Type INTEGER, IMO TEXT, CallSign TEXT, MothershipMMSI INTEGER, LastVoyageID INTEGER);"
                SQLcommand.ExecuteNonQuery()
                'Create Voyages Table
                SQLcommand.CommandText = "CREATE TABLE Voyages(VoyageID INTEGER PRIMARY KEY AUTOINCREMENT, Date NUMERIC, Length INTEGER, Breadth INTEGER, Draught INTEGER, Destination TEXT, ETA NUMERIC, MMSI INTEGER);"
                SQLcommand.ExecuteNonQuery()
                'Create Meta Table
                SQLcommand.CommandText = "CREATE TABLE Files(Name TEXT PRIMARY KEY, Size NUMERIC, Date NUMERIC);"
                SQLcommand.ExecuteNonQuery()
            End Using 'SQLcommand
        End Using 'SQLconnect
        Return True
    End Function

What could be causing the first query to be so slow, compared to the second query, and take longer as more data is added to the DB?
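
One way to check whether SQLite really executes the two queries the same way is to ask it for the query plan with `EXPLAIN QUERY PLAN`. The sketch below assumes the `Voyages` table created by `CreateDB`; the `GetQueryPlan` helper, the `ais.db` file name, and the MMSI value `123456789` are placeholders made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data.SQLite

Module QueryPlanSketch
    ' Hypothetical helper: returns the "detail" text of every row that
    ' EXPLAIN QUERY PLAN produces for the given SELECT statement.
    Public Function GetQueryPlan(conn As SQLiteConnection, sql As String) As List(Of String)
        Dim plan As New List(Of String)
        Using cmd As SQLiteCommand = conn.CreateCommand()
            cmd.CommandText = "EXPLAIN QUERY PLAN " & sql
            Using reader As SQLiteDataReader = cmd.ExecuteReader()
                While reader.Read()
                    plan.Add(reader("detail").ToString())
                End While
            End Using
        End Using
        Return plan
    End Function

    Sub Main()
        'Placeholder path: substitute the database file created by CreateDB
        Using conn As New SQLiteConnection("Data Source=ais.db;")
            conn.Open()
            'Placeholder MMSI: substitute one that exists in the data
            Dim sql As String = "SELECT * FROM Voyages WHERE MMSI = 123456789 ORDER BY VoyageID DESC LIMIT 1"
            For Each line As String In GetQueryPlan(conn, sql)
                Console.WriteLine(line)
            Next
        End Using
    End Sub
End Module
```

In the output, a line beginning with `SCAN` means SQLite steps through the table row by row, while `SEARCH ... USING INDEX` means it performs an index lookup; the exact wording differs between SQLite versions.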
SQLite and System.Data.SQLite are both the latest versions.