Question

I am scanning an SQLite database looking for all matches and using

OneFound:=False;
while not tbl1.Eof do
begin
  if tbl1.FieldByName('Name').AsString = 'jones' then
    OneFound:=True;
  tbl1.Next; // advance on every row, not only on a match
end;
if OneFound then // Do something

or should I be using

if not(OneFound) then OneFound:=True;

Is it faster to just assign "True" to OneFound no matter how many times it is assigned, or should I do the comparison and only change OneFound the first time?

I know a better way would be to use FTS3, but for now I have to scan the database and the question is more on the approach to setting OneFound as many times as a match is encountered or using the compare-approach and setting it just once.

Thanks

Solution

Your question is, which is faster:

if not(OneFound) then OneFound:=True;

or

OneFound := True;

The answer is probably that the second is faster. A conditional statement compiles to a branch, and branches risk misprediction, whereas the unconditional assignment is a single store.

However, that line of code is trivial compared to what is around it. Running across a database one row at a time is going to be outrageously expensive. I bet that you will not be able to measure the difference between the two options because the handling of that little Boolean is simply swamped by the rest of the code. In which case choose the more readable and simpler version.

But if you care about the performance of this code you should be asking the database to do the work, as you yourself state. Write a query to perform the work.

OTHER TIPS

It would be better to change your SQL statement so that the work is done in the database. If you want to know whether there is a tuple which contains the value 'jones' in the field 'name', then a quicker query would be

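with TQuery.Create(nil) do
try
  // set the component's database/connection property here before opening
  SQL.Add('select name from tbl1 where name = :p1 limit 1');
  ParamByName('p1').AsString := 'jones'; // parameters belong to the query, not to SQL
  Open;
  OneFound := not IsEmpty;
  Close;
finally
  Free; // release the query even if Open raises
end;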

Your syntax may vary regarding the 'limit' clause, but the idea is to return only one tuple from the database that matches the 'where' clause; it doesn't matter which one.

I used a parameter to avoid problems delimiting the value.

1. Search one field

If you want to search one particular field content, using an INDEX and a SELECT will be the fastest.

   SELECT * FROM MYTABLE WHERE NAME='Jones';

Do not forget to create an INDEX on the column, first!
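
As a rough sketch, the index and the lookup could be kept as SQL constants in a Delphi unit. The unit name and the index name idx_mytable_name are made up for illustration, and how the statements are executed depends on the SQLite component you use.

```pascal
unit IndexSketch; // hypothetical unit name

interface

const
  // Run once: creates the index on the searched column if it is missing.
  SQL_CREATE_NAME_INDEX =
    'CREATE INDEX IF NOT EXISTS idx_mytable_name ON MYTABLE(NAME)';

  // Equality lookup that can use the index; bind the value to :name.
  SQL_FIND_BY_NAME =
    'SELECT * FROM MYTABLE WHERE NAME = :name';

implementation

end.
```

Note that SQLite compares text case-sensitively by default, so 'Jones' and 'jones' are different values for this query.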

2. Fast reading

But if you want to search within a field, or within several fields, you may have to read and check the whole content. In this case, the slow part will be calling FieldByName() for each data row, since it looks the field up by name every time: it is better to look it up once and keep it in a local TField variable.
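
A minimal sketch of the local TField idea, assuming any TDataSet descendant; the unit name and the function AnyRowMatches are hypothetical, not part of any library.

```pascal
unit ScanSketch; // hypothetical unit name

interface

uses
  DB; // named Data.DB in recent Delphi versions

function AnyRowMatches(DataSet: TDataSet;
  const FieldName, Wanted: string): Boolean;

implementation

function AnyRowMatches(DataSet: TDataSet;
  const FieldName, Wanted: string): Boolean;
var
  F: TField;
begin
  Result := False;
  F := DataSet.FieldByName(FieldName); // name lookup done once, outside the loop
  DataSet.First;
  while not DataSet.Eof do
  begin
    if F.AsString = Wanted then
      Result := True; // unconditional assignment; per-match work would go here
    DataSet.Next;     // advance on every row
  end;
end;

end.
```

It would be called as OneFound := AnyRowMatches(tbl1, 'Name', 'jones'); if only the flag matters, exiting the loop on the first match saves the rest of the scan.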

Or forget about TDataSet, and switch to direct access to SQLite3. In fact, using DB.pas and TDataSet requires a lot of data marshalling, so it is slower than direct access.

See e.g. DiSQLite3 or our DB classes, which are very fast, but a bit higher level. Or you can use our ORM on top of those classes. Our classes are able to read more than 500,000 rows per second from a SQLite3 database, including JSON marshalling into object fields.

3. FTS3/FTS4

But, as you guessed, the fastest would indeed be to use the FTS3/FTS4 feature of SQLite3.

You can think of FTS3/FTS4 as a "meta-index" or a "full-text index" on a supplied blob of text. Just like Google is able to find a word in millions of web pages: it does not use a regular database, but full-text indexing.

In short, you create a virtual FTS3/FTS4 table in your database, then you insert into it the whole text of your main records in the FTS TEXT field, setting its ID to the ID of the original data row.

Then, you will query for some words on your FTS3/FTS4 table, which will give you the matching IDs, much faster than a regular scan.
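
The create/fill/query steps could look roughly like the following, kept as SQL constants in a Delphi unit. The names tbl1_fts and body are invented for illustration, tbl1 is assumed to be an ordinary rowid table, and the SQLite library in use must have FTS3/FTS4 compiled in. How the statements are executed depends on your SQLite component.

```pascal
unit FtsSketch; // hypothetical unit name

interface

const
  // Virtual full-text table with a single text column.
  SQL_FTS_CREATE =
    'CREATE VIRTUAL TABLE tbl1_fts USING fts4(body)';

  // Copy the text to index; docid is set to the rowid of the source row.
  SQL_FTS_FILL =
    'INSERT INTO tbl1_fts(docid, body) SELECT rowid, Name FROM tbl1';

  // Returns the docid of every row whose text contains the word bound to :word.
  SQL_FTS_SEARCH =
    'SELECT docid FROM tbl1_fts WHERE body MATCH :word';

implementation

end.
```

The docid values returned by the search can then be used to fetch the matching rows from tbl1.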

Note that our ORM has dedicated TSQLRecordFTS3 / TSQLRecordFTS4 kind of classes for direct FTS process.

Licensed under: CC-BY-SA with attribution