Question

Sqlite has a limitation that it will only use one index per query. That limitation is biting me at the moment, but I need Sqlite because I'm not aware of any other local DB engine that can compete for insert speed (although I'm open to suggestions).

I have this simple table (among others) with one million to ten million rows:

CREATE TABLE [Events] (
  [Id] INTEGER PRIMARY KEY, 
  [TelemetryId] INTEGER NOT NULL, 
  [TimestampTicks] INTEGER NOT NULL, 
  [Value] TEXT NOT NULL)

Looking at my data I have about 2000 unique TelemetryId values and about 25000 rows per unique TelemetryId. I have been using this index:

CREATE INDEX [IX_Events_TimestampTicks_TelemetryId] 
  ON [Events] ([TimestampTicks], [TelemetryId])

However, that index fails me on my queries where I don't pass in a TimestampTicks constraint (obviously). That index was after I had attempted individual indexes on TimestampTicks and TelemetryId. From my testing, and even after running ANALYZE, Sqlite would only ever use the index on TelemetryId whenever that was referenced -- which is wrong in the queries where I'm restricting to a Timestamp range. If I reverse the order of the columns in my combo index, my queries that were previously fast become slow.

Here is the complete list of my queries. Can you see an indexing scheme that will work for all of them?

INSERT INTO Events (TelemetryId, TimestampTicks, Value) 
  VALUES(@TelemetryId, @TimestampTicks, @Value); SELECT last_insert_rowid()

SELECT * FROM Events e 
  INNER JOIN Telemetry ss ON ss.Id = e.TelemetryId 
  INNER JOIN Services s ON s.Id = ss.ServiceId 
  WHERE s.AssetId = @AssetId AND e.TimestampTicks >= @StartTime 
  ORDER BY e.TimestampTicks LIMIT 10000

SELECT * FROM Events e 
  WHERE e.TimestampTicks >= @StartTime 
  ORDER BY e.TimestampTicks LIMIT 10000

SELECT * FROM Events 
  WHERE TelemetryId = @TelemetryId AND TimestampTicks <= @TimestampTicks 
  ORDER BY TimestampTicks DESC LIMIT 1

SELECT MIN(TimestampTicks) FROM Events
SELECT MAX(TimestampTicks) FROM Events
SELECT COUNT(*) FROM Events

SELECT TimestampTicks, [Value] FROM Events 
  WHERE TelemetryId = @TelemetryId

SELECT Id FROM Events 
  WHERE TelemetryId = @TelemetryId LIMIT 2

SELECT MIN(e.TimestampTicks) FROM Events e 
  INNER JOIN Telemetry ss ON ss.ID = e.TelemetryID 
  INNER JOIN Services s ON s.ID = ss.ServiceID 
  WHERE s.AssetID = @AssetId

SELECT MAX(e.TimestampTicks) FROM Events e 
  INNER JOIN Telemetry ss ON ss.ID = e.TelemetryID 
  INNER JOIN Services s ON s.ID = ss.ServiceID 
  WHERE s.AssetID = @AssetId

SELECT * FROM Events 
  WHERE TimestampTicks <= @TimestampTicks AND TelemetryId = @TelemetryId 
  ORDER BY TimestampTicks DESC LIMIT 1

SELECT e.Id, e.TelemetryId, e.TimestampTicks, e.Value 
  FROM (SELECT e2.Id AS [Id], MIN(e2.TimestampTicks) as [TimestampTicks]
        FROM Events e2 WHERE e2.TimestampTicks 
            BETWEEN @Min AND @Max AND e2.TelemetryId in @TelemetryIds                                          
            GROUP BY e2.TelemetryId) AS grp
  INNER JOIN Events e ON grp.Id = e.Id

Solution

Nothing stops you from creating multiple indexes; each index can help with certain queries.

If I were you, I would create at least the following two indexes:

CREATE INDEX events_1_ix ON Events(TimestampTicks,TelemetryId);

(the one you have been using), and

CREATE INDEX events_2_ix ON Events(TelemetryId);

SQLite can make use of these indexes in the following situations:

  1. Search when TimestampTicks and TelemetryId are provided (1st index)
  2. Search when TimestampTicks only is provided (also 1st index)
  3. Search when TelemetryId only is provided (2nd index)

If you create only separate single-column indexes on TimestampTicks and TelemetryId, cases 2 and 3 stay fast, but case 1 loses its composite index: SQLite has to pick one of the two indexes and filter the remaining rows itself.
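To check which index SQLite actually picks for a given query, you can ask the planner with EXPLAIN QUERY PLAN. The following is a minimal sketch using Python's built-in sqlite3 module against an empty in-memory copy of the Events table; the `plan` helper is a hypothetical convenience, not part of any library. The planner's choice can differ by SQLite version and after ANALYZE has gathered statistics, so treat the output as a diagnostic rather than a guarantee.

```python
import sqlite3

# In-memory database with the Events table and the two indexes suggested above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Events (
  Id INTEGER PRIMARY KEY,
  TelemetryId INTEGER NOT NULL,
  TimestampTicks INTEGER NOT NULL,
  Value TEXT NOT NULL);
CREATE INDEX events_1_ix ON Events(TimestampTicks, TelemetryId);
CREATE INDEX events_2_ix ON Events(TelemetryId);
""")


def plan(sql, params):
    """Hypothetical helper: return the planner's description for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail text is the last column of every plan row.
    return " | ".join(row[-1] for row in rows)


# Case 2: only TimestampTicks is constrained.
by_time = plan(
    "SELECT * FROM Events WHERE TimestampTicks >= ? "
    "ORDER BY TimestampTicks LIMIT 10000", (0,))

# Case 3: only TelemetryId is constrained.
by_telemetry = plan(
    "SELECT TimestampTicks, Value FROM Events WHERE TelemetryId = ?", (1,))

# Case 1: both columns are constrained; inspect which index the planner prefers.
by_both = plan(
    "SELECT * FROM Events WHERE TelemetryId = ? AND TimestampTicks <= ? "
    "ORDER BY TimestampTicks DESC LIMIT 1", (1, 0))

for label, text in (("time", by_time), ("telemetry", by_telemetry), ("both", by_both)):
    print(label, "->", text)
```

Running the same helper against your real database, after ANALYZE, shows the plans your application will actually get; measuring is the only reliable guide when two indexes both apply.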

You can create as many indexes as you want, but remember that index maintenance does not come for free. First, every index takes extra disk space; it is not uncommon for a single index to occupy 10%-30% of the table size, so with too many indexes their total size might exceed the size of the table itself. Second, every insert or update also has to update each affected index, so with many indexes write speed can become much slower than without them.
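The storage cost is easy to observe. The sketch below, assuming synthetic data (20,000 made-up rows spread over 2,000 TelemetryId values) and a hypothetical helper named `pages_used`, loads the same rows into the Events table with and without the two indexes and compares the number of database pages in use. The exact numbers depend on page size and data, so it only prints them.

```python
import sqlite3


def pages_used(with_indexes, rows=20000):
    """Hypothetical helper: load synthetic rows and return the database page count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE Events (
          Id INTEGER PRIMARY KEY,
          TelemetryId INTEGER NOT NULL,
          TimestampTicks INTEGER NOT NULL,
          Value TEXT NOT NULL)""")
    if with_indexes:
        conn.execute("CREATE INDEX events_1_ix ON Events(TimestampTicks, TelemetryId)")
        conn.execute("CREATE INDEX events_2_ix ON Events(TelemetryId)")
    # Synthetic data: 2000 distinct TelemetryId values, increasing timestamps.
    conn.executemany(
        "INSERT INTO Events (TelemetryId, TimestampTicks, Value) VALUES (?, ?, ?)",
        ((i % 2000, i * 10, "v%d" % i) for i in range(rows)))
    conn.commit()
    count = conn.execute("PRAGMA page_count").fetchone()[0]
    conn.close()
    return count


plain = pages_used(with_indexes=False)
indexed = pages_used(with_indexes=True)
print("pages without indexes:", plain)
print("pages with indexes:   ", indexed)
```

The same pattern, with a timer around the insert loop and your real row counts, gives a rough idea of how much each additional index costs in insert speed.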

Regarding your original statement that SQLite can use only one index per query: this is not quite correct.

The correct statement is that SQLite generally uses at most one index per table in a given query (the main exception being WHERE clauses whose terms are combined with OR). If your SQL joins more than one table, each table can use whichever index gives the best performance for accessing that table.
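A join from the question makes this visible. In the sketch below the Telemetry and Services schemas, and the indexes `services_asset_ix` and `telemetry_service_ix`, are assumptions made up for illustration, since the question does not show them. The plan lists one access step per table, and each step can name its own index or primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed minimal schemas; the question only shows the Events table.
CREATE TABLE Services (Id INTEGER PRIMARY KEY, AssetId INTEGER NOT NULL);
CREATE TABLE Telemetry (Id INTEGER PRIMARY KEY, ServiceId INTEGER NOT NULL);
CREATE TABLE Events (
  Id INTEGER PRIMARY KEY,
  TelemetryId INTEGER NOT NULL,
  TimestampTicks INTEGER NOT NULL,
  Value TEXT NOT NULL);
-- Hypothetical indexes on the joined tables.
CREATE INDEX services_asset_ix ON Services(AssetId);
CREATE INDEX telemetry_service_ix ON Telemetry(ServiceId);
-- Indexes from the answer.
CREATE INDEX events_1_ix ON Events(TimestampTicks, TelemetryId);
CREATE INDEX events_2_ix ON Events(TelemetryId);
""")

rows = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT MIN(e.TimestampTicks) FROM Events e
      INNER JOIN Telemetry ss ON ss.Id = e.TelemetryId
      INNER JOIN Services s ON s.Id = ss.ServiceId
      WHERE s.AssetId = ?""", (1,)).fetchall()

# Keep only the per-table access steps (the detail text is the last column).
steps = [row[-1] for row in rows if row[-1].startswith(("SEARCH", "SCAN"))]
for step in steps:
    print(step)
```

Each printed line describes how one table in the join is accessed; which index appears on which line depends on the SQLite version and on the statistics available to the planner.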

Licensed under: CC-BY-SA with attribution