Question

I have a long-term data logging service that produces files, each containing one day's worth of data. I'm loading the files into an SQLite DB from a Windows Forms application. The procedure that inserts a file's data into the DB includes two queries whose results are used in the subsequent insert.

Using SQLconnect As New SQLite.SQLiteConnection("Data Source=" & fn & ";")
SQLconnect.Open()
Using SQLcommand As SQLite.SQLiteCommand = SQLconnect.CreateCommand
    Dim SqlTrans As System.Data.SQLite.SQLiteTransaction = SQLconnect.BeginTransaction
    For Each Path As String In paths
        fs = System.IO.File.Open(Path, IO.FileMode.Open, IO.FileAccess.Read, IO.FileShare.Read) 'Open file
        Do While ReadFromStoreFile(fs, dt, Sent) = True 'Read a Timestamp/sentence pair
            'Create a positions table for this MMSI if one doesn't already exist
            SQLcommand.CommandText = "CREATE TABLE IF NOT EXISTS MMSI" & msg.MMSI & " (PosID INTEGER PRIMARY KEY AUTOINCREMENT, Date NUMERIC, Lat REAL, Lon REAL, Status INTEGER, SOG REAL, COG INTEGER, HDG INTEGER, VoyageID INTEGER);"
            SQLcommand.ExecuteNonQuery()

            Select Case msg.Type 'Dynamic position report

                Case AIS.MsgType.PosRptClsA

                    '###THIS QUERY TAKES 20 secs per file (day) and increases 3 seconds per day!
                    SQLcommand.CommandText = "SELECT * FROM Voyages WHERE MMSI = " & msg.MMSI & " ORDER BY VoyageID DESC LIMIT 1" 'still the same
                    SQLreader = SQLcommand.ExecuteReader()
                    SQLreader.Read()
                    VID = SQLreader.Item(0)
                    SQLreader.Close()

                    SQLcommand.CommandText = "INSERT INTO MMSI" & msg.MMSI & " (Date, Lat, Lon, Status, SOG, COG, HDG, VoyageID) VALUES (" & ts & ", " & msg.Latitude & ", " & msg.Longitude & ", " & msg.NavStatus & ", " & SOG & ", " & COG & ", " & HDG & ", " & VID & ")"
                    SQLcommand.ExecuteNonQuery()

                Case AIS.MsgType.StatAndVge

                    'Find the latest entry for this MMSI in the Voyages table
                    '###THIS QUERY takes 3 secs for same number of queries and does NOT increase
                    SQLcommand.CommandText = "SELECT * FROM Voyages WHERE MMSI = " & msg.MMSI & " ORDER BY VoyageID DESC LIMIT 1"
                    SQLreader = SQLcommand.ExecuteReader()
                    SQLreader.Read()

                    Dim NoVoyage As Boolean = Not (SQLreader.HasRows)
                    If Not NoVoyage Then
                        'If the data has changed, add a new entry
                        If Not (SQLreader.Item(2) = msg.Length) Then Changed = True
                        If Not (SQLreader.Item(3) = msg.Breadth) Then Changed = True
                        If Not (SQLreader.Item(4) = msg.Draught) Then Changed = True
                        If Not (SQLreader.Item(5) = msg.Destination) Then Changed = True
                        If Not (SQLreader.Item(6) = msg.ETA.Ticks) Then Changed = True
                        VoyageID = SQLreader.Item(0)
                    End If
                    SQLreader.Close()

                    If Changed Or NoVoyage Then
                        Changed = False 'reset flag
                        SQLcommand.CommandText = "INSERT INTO Voyages (Date, Length, Breadth, Draught, Destination, ETA, MMSI) VALUES (" & ts & ", " & msg.Length & ", " & msg.Breadth & ", " & msg.Draught & ", '" & msg.Destination.Replace("'", "''") & "', " & msg.ETA.Ticks & ", " & msg.MMSI_int & ")"
                        SQLcommand.ExecuteNonQuery()

                        SQLcommand.CommandText = "SELECT last_insert_rowid()"
                        SQLreader = SQLcommand.ExecuteReader()
                        SQLreader.Read()
                        VoyageID = SQLreader.Item(0)
                        SQLreader.Close()

                    End If
            End Select 'message type
        Loop 'Read next entry from file
        fs.Close() 'Close the file

        'Write this file into the files table, so we know it has been written to the DB
        fileinf = New System.IO.FileInfo(Path)
        SQLcommand.CommandText = "INSERT OR REPLACE INTO  Files (Name, Size, Date) VALUES ('" & fileinf.Name & "', '" & fileinf.Length & "', '" & fileinf.LastWriteTimeUtc.Ticks & "')"
        SQLcommand.ExecuteNonQuery()

    Next 'The next path in the list of paths to decode

    SqlTrans.Commit() 'End of all files reached, commit all the changes to the DB

End Using 'SQLcommand
End Using 'SQLconnect

As indicated in the code, the first query is taking a very long time and (more importantly) is increasing in duration as data is loaded into the DB. When adding to 21 days of data in the DB, this query alone takes a cumulative time of around 20 secs per day and increases by about 3 secs per day for each day added. The really strange thing is that the second query (which seems identical to me) is fast (around 3 secs for the same number of queries) and is NOT increasing as more data is added.

Here is the function that creates the empty database:

Public Function CreateDB(fn As String, Force As Boolean) As Boolean

    If System.IO.File.Exists(fn) Then
        If Force Then
            System.IO.File.Delete(fn) 'Delete the old DB and create a new one
        Else
            Return True 'DB already exists so just return true
        End If
    End If


    Using SQLconnect As New SQLite.SQLiteConnection
        SQLconnect.ConnectionString = "Data Source=" & fn & ";"
        SQLconnect.Open()

        'Create Tables
        Using SQLcommand As SQLite.SQLiteCommand = SQLconnect.CreateCommand

            'Set page size
            SQLcommand.CommandText = "PRAGMA Page_size = 4096;"
            SQLcommand.ExecuteNonQuery()

            'Set journalling mode to off
            SQLcommand.CommandText = "PRAGMA journal_mode = OFF;"
            SQLcommand.ExecuteNonQuery()

            'Set auto indexing off
            SQLcommand.CommandText = "PRAGMA automatic_index = false;"
            SQLcommand.ExecuteNonQuery()

            'Create Vessels Table
            SQLcommand.CommandText = "CREATE TABLE Vessels(MMSI TEXT PRIMARY KEY, Name TEXT, Type INTEGER, IMO TEXT, CallSign TEXT, MothershipMMSI INTEGER, LastVoyageID INTEGER);"
            SQLcommand.ExecuteNonQuery()

            'Create Voyages Table
            SQLcommand.CommandText = "CREATE TABLE Voyages(VoyageID INTEGER PRIMARY KEY AUTOINCREMENT, Date NUMERIC, Length INTEGER, Breadth INTEGER, Draught INTEGER, Destination TEXT, ETA NUMERIC, MMSI INTEGER);"
            SQLcommand.ExecuteNonQuery()

            'Create Meta Table
            SQLcommand.CommandText = "CREATE TABLE Files(Name TEXT PRIMARY KEY, Size NUMERIC, Date NUMERIC);"
            SQLcommand.ExecuteNonQuery()

        End Using 'SQLcommand

    End Using ' SQLconnect 

    Return True

End Function

What could be causing the first query to be so slow, compared to the second query, and take longer as more data is added to the DB?

SQLite and System.Data.SQLite are the latest versions.


Solution

Assuming msg is updated by ReadFromStoreFile on each pass through the loop, the query

"SELECT * FROM Voyages WHERE MMSI = " & msg.MMSI & " ORDER BY VoyageID DESC LIMIT 1"

will get slower as the number of voyages for the given MMSI grows. So I assume that the MMSIs that send AIS.MsgType.PosRptClsA messages have voyages inserted more frequently than the other MMSIs.
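A related point that may be worth checking: the Voyages table created in CreateDB has no index on MMSI, so a lookup by MMSI cannot seek directly to the matching rows. The following is a minimal sketch, not tested against the asker's data, that prints SQLite's query plan for the slow lookup before and after adding an index. The module name, the ShowPlan helper and the index name idx_voyages_mmsi are hypothetical; the schema is copied from CreateDB.

```vb
Imports System
Imports System.Data.SQLite

Module IndexSketch
    ' Hypothetical helper: prints the plan SQLite picks for the slow lookup
    Sub ShowPlan(conn As SQLiteConnection)
        Using cmd As SQLiteCommand = conn.CreateCommand()
            cmd.CommandText = "EXPLAIN QUERY PLAN SELECT * FROM Voyages " &
                              "WHERE MMSI = @mmsi ORDER BY VoyageID DESC LIMIT 1"
            cmd.Parameters.AddWithValue("@mmsi", 0L)
            Using reader As SQLiteDataReader = cmd.ExecuteReader()
                While reader.Read()
                    ' The last column of each plan row holds the readable description
                    Console.WriteLine(reader.GetValue(reader.FieldCount - 1).ToString())
                End While
            End Using
        End Using
    End Sub

    Sub Main()
        ' In-memory database using the Voyages schema from CreateDB
        Using conn As New SQLiteConnection("Data Source=:memory:;")
            conn.Open()
            Using cmd As SQLiteCommand = conn.CreateCommand()
                cmd.CommandText = "CREATE TABLE Voyages(VoyageID INTEGER PRIMARY KEY AUTOINCREMENT, " &
                                  "Date NUMERIC, Length INTEGER, Breadth INTEGER, Draught INTEGER, " &
                                  "Destination TEXT, ETA NUMERIC, MMSI INTEGER);"
                cmd.ExecuteNonQuery()
                ShowPlan(conn) ' plan without an index on MMSI

                ' Assumption: this index is not part of the original schema
                cmd.CommandText = "CREATE INDEX IF NOT EXISTS idx_voyages_mmsi ON Voyages(MMSI);"
                cmd.ExecuteNonQuery()
                ShowPlan(conn) ' plan once the index exists
            End Using
        End Using
    End Sub
End Module
```

If the first plan reports a scan of Voyages and the second reports a search using the index, adding the same CREATE INDEX statement to CreateDB would be a reasonable thing to try. Whether it removes the slowdown on the real data would need to be measured.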

It appears the query is getting the maximum voyage id for the MMSI. You could do this more directly with

"SELECT max(VoyageID) FROM Voyages WHERE MMSI = " & msg.MMSI

I don't know whether this will run faster. Alternatively, you could keep a dictionary that maps each MMSI to its maximum VoyageID, update it whenever you insert into Voyages, and eliminate the query altogether.
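The dictionary idea could be sketched roughly as follows. This is an illustration under assumptions rather than a drop-in fix: the module and its members (VoyageCacheSketch, RememberVoyage, GetLatestVoyage) are hypothetical names, and the MMSI is assumed to be available as an integer, as with msg.MMSI_int in the question's INSERT. On a cache miss it falls back to the max(VoyageID) query, so data loaded in an earlier run is still found.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Data.SQLite

Module VoyageCacheSketch
    ' Hypothetical cache: MMSI -> highest VoyageID seen so far
    Private ReadOnly latestVoyage As New Dictionary(Of Long, Long)

    ' Call after every INSERT INTO Voyages so the cache stays current
    Sub RememberVoyage(mmsi As Long, voyageId As Long)
        latestVoyage(mmsi) = voyageId
    End Sub

    ' Returns the latest VoyageID for an MMSI, or Nothing if it has no voyages.
    ' Only a cache miss reaches the database.
    Function GetLatestVoyage(conn As SQLiteConnection, mmsi As Long) As Long?
        Dim cached As Long
        If latestVoyage.TryGetValue(mmsi, cached) Then Return cached

        Using cmd As SQLiteCommand = conn.CreateCommand()
            cmd.CommandText = "SELECT max(VoyageID) FROM Voyages WHERE MMSI = @mmsi"
            cmd.Parameters.AddWithValue("@mmsi", mmsi)
            Dim result As Object = cmd.ExecuteScalar()
            ' max() over zero rows yields NULL, which arrives as DBNull
            If result Is Nothing OrElse result Is DBNull.Value Then Return Nothing
            Dim voyageId As Long = Convert.ToInt64(result)
            latestVoyage(mmsi) = voyageId
            Return voyageId
        End Using
    End Function
End Module
```

In the question's loop, the PosRptClsA branch would call GetLatestVoyage(SQLconnect, msg.MMSI_int) instead of running its SELECT, and the StatAndVge branch would call RememberVoyage(msg.MMSI_int, VoyageID) right after reading last_insert_rowid(). Using a parameter for the MMSI also avoids concatenating values into the SQL text.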

Licensed under: CC-BY-SA with attribution