Question

I am trying to analyze network traffic. The traffic is saved in a SQLite file of approximately 300 MB, which I am trying to comb for keywords.

I have about 10 keywords, for which I generate mutations (reversed string, hashes, etc.), amounting to approximately 20 variants per keyword. To find these variants, I generate one query per keyword that includes all of its variants in a single statement, separated by OR. One query looks like this:

SELECT * FROM flows 
WHERE 
(buffer LIKE :permutationOne) OR 
(buffer LIKE :permutationTwo) OR 
(buffer LIKE :permutationThree) OR
…
(buffer LIKE :permutationTwenty)

The initialization of the SQLite connection and the statement binding look like this:

$sqlite = new PDO('sqlite:resources/traffic.sqlite'); // executed once

// done for each keyword
$statement = $sqlite->prepare($sqlCommand);
$statement->execute([':permutationOne' => '%perm1%', ':permutationTwo' => '%perm2%', …]);

I measured how long these ten queries take to execute: between 150 and 300 seconds, depending on the device they run on. Since the file is big and I execute 10 queries with 20 LIKE patterns each, I was wondering whether there is a way to optimize the query. I would very much like to at least halve the execution time, if that is possible. Or should I use a library other than PDO?


Solution

OK. We decided in my other answer that an index will not help in this case. If you are stuck with SQLite, you can use its Full-Text Search (FTS) engine. It ships with the source code, but depending on how your copy was built, you may have to recompile SQLite with this feature turned on.

More info:

http://answers.oreilly.com/topic/1955-how-to-use-full-text-search-in-sqlite/
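
As a rough sketch of what the FTS route might look like from PDO, the snippet below builds a full-text index over flows.buffer and replaces the OR chain of LIKE terms with a single MATCH. It assumes the FTS5 module is compiled into the SQLite library that PDO links against. The flows_fts table, the id column, and the sample rows are hypothetical, and an in-memory database stands in for resources/traffic.sqlite. Keep in mind that FTS matches whole tokens rather than arbitrary substrings, so it is not a drop-in replacement for LIKE '%keyword%'.

```php
<?php
// Sketch only: an in-memory database stands in for resources/traffic.sqlite.
$sqlite = new PDO('sqlite::memory:');
$sqlite->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema and sample rows; only flows.buffer comes from the question.
$sqlite->exec('CREATE TABLE flows (id INTEGER PRIMARY KEY, buffer TEXT)');
$insert = $sqlite->prepare('INSERT INTO flows (buffer) VALUES (:buffer)');
foreach (['GET /login user=alice', 'POST /upload token=secret', 'ping'] as $buffer) {
    $insert->execute([':buffer' => $buffer]);
}

$matchingIds = null; // stays null if this SQLite build lacks FTS5
try {
    // flows_fts is a hypothetical name for the full-text index.
    $sqlite->exec('CREATE VIRTUAL TABLE flows_fts USING fts5(buffer)');
    $sqlite->exec('INSERT INTO flows_fts (rowid, buffer) SELECT id, buffer FROM flows');

    // A single MATCH with OR stands in for the chain of LIKE terms.
    $statement = $sqlite->prepare(
        'SELECT rowid FROM flows_fts WHERE flows_fts MATCH :terms ORDER BY rowid'
    );
    $statement->execute([':terms' => 'alice OR secret']);
    $matchingIds = array_map('intval', $statement->fetchAll(PDO::FETCH_COLUMN));
    print_r($matchingIds);
} catch (PDOException $e) {
    echo 'FTS5 does not appear to be available: ', $e->getMessage(), "\n";
}
```

If the CREATE VIRTUAL TABLE statement throws, that is a sign the library was built without FTS5, which is the case where recompiling would be needed.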

Other tips

Performing this kind of operation on the database (LIKE '%something%') is very expensive. To improve performance, I would recommend indexing the field in question, or using a full-text search server such as Sphinx (http://sphinxsearch.com/about/sphinx/).

EDIT:

This answer did not solve the problem, but SQLite will apply a LIKE query optimization if certain conditions are met. In particular, the search string must not begin with a wildcard, which rules out patterns such as '%perm1%'. I'll leave this answer here since it might help with other LIKE optimizations.
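
To see the effect of the leading wildcard, here is a small sketch that asks SQLite for the query plan of both pattern shapes through PDO. It assumes LIKE keeps its default case-insensitive behaviour, which is why the index is declared with COLLATE NOCASE. The index name flows_buffer_nocase, the queryPlan() helper, and the id column are made up for this illustration, and an in-memory database stands in for the real file.

```php
<?php
// Sketch only: an in-memory database stands in for resources/traffic.sqlite.
$sqlite = new PDO('sqlite::memory:');
$sqlite->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$sqlite->exec('CREATE TABLE flows (id INTEGER PRIMARY KEY, buffer TEXT)');

// LIKE is case-insensitive by default, so the index is declared NOCASE here.
// flows_buffer_nocase is a hypothetical index name.
$sqlite->exec('CREATE INDEX flows_buffer_nocase ON flows (buffer COLLATE NOCASE)');

// Hypothetical helper: returns the "detail" column of EXPLAIN QUERY PLAN,
// one line per plan step.
function queryPlan(PDO $db, string $sql): string
{
    $rows = $db->query('EXPLAIN QUERY PLAN ' . $sql)->fetchAll(PDO::FETCH_ASSOC);
    return implode("\n", array_column($rows, 'detail'));
}

// Pattern shape used in the question: wildcard on both sides.
$leadingWildcard = queryPlan($sqlite, "SELECT * FROM flows WHERE buffer LIKE '%secret%'");
// Same keyword without the leading wildcard.
$prefixOnly = queryPlan($sqlite, "SELECT * FROM flows WHERE buffer LIKE 'secret%'");

echo "leading wildcard: $leadingWildcard\n";
echo "prefix only:      $prefixOnly\n";
```

On a build where the optimization applies, the first plan should start with SCAN and the second with SEARCH, which is why it does not help with the patterns in the question.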

Old Answer:

I have troubleshot SQLite performance before. Open your database by typing sqlite3 databasefile. These are some of the commands I use at the sqlite3 command line:

.help
.timer ON
.explain ON -- optional
EXPLAIN QUERY PLAN SELECT BLAH FROM BLAH WHERE BLAH;

If you see SCAN, the query reads the whole table, which is bad. If you see SEARCH, it is using an index. You can add an index to improve SELECT performance.

You could try an index like this in the sqlite3 command line:

CREATE INDEX flows_idx1 ON flows (buffer);

This creates the index as part of the database schema, so you do not have to recreate it: it persists until you drop it. The SQLite query optimizer will look at your SELECT and decide whether the index helps speed it up. You do not need to change your SELECT query at all.

See also the SQLite optimization documentation.
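
The same statement can also be issued from PDO. The sketch below is only an illustration of the persistence point: it creates the index in a throwaway file (standing in for resources/traffic.sqlite), closes the connection, reopens the file, and lists the indexes recorded in the schema. The id column and the temporary file are assumptions made for the example, and IF NOT EXISTS is added so the statement can be re-run safely.

```php
<?php
// Sketch only: a throwaway file stands in for resources/traffic.sqlite.
$path = tempnam(sys_get_temp_dir(), 'flows');

$sqlite = new PDO('sqlite:' . $path);
$sqlite->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$sqlite->exec('CREATE TABLE flows (id INTEGER PRIMARY KEY, buffer TEXT)');
// IF NOT EXISTS makes the statement safe to run more than once.
$sqlite->exec('CREATE INDEX IF NOT EXISTS flows_idx1 ON flows (buffer)');
$sqlite = null; // close the connection

// Reopen the file and read the index names stored in the schema.
$sqlite = new PDO('sqlite:' . $path);
$indexNames = $sqlite
    ->query("SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'flows'")
    ->fetchAll(PDO::FETCH_COLUMN);
print_r($indexNames);

// Clean up the throwaway file.
$sqlite = null;
unlink($path);
```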

License: CC-BY-SA with attribution
Not affiliated with StackOverflow