Question

I am using a TSQLConnection and TSQLDataSet to query a SQL Server 2012 database from a Delphi application. All my queries so far have worked fine; however, I am now trying to write a SELECT query with an INNER JOIN, and I can't access any output from the TSQLDataSet.

The code:

Query_text:='SELECT Table1.Price '+
            'FROM [Table1] '+
            'INNER JOIN [Table2] '+
            'ON Table1.Code_ID = Table2.ID '+
            'WHERE (Table2.Code = '+QuotedStr(Temp_code)+')';

SQL_dataset.CommandType:=ctQuery; 
SQL_dataset.CommandText:=Query_text;
SQL_dataset.Open;  

If SQL_dataset.RecordCount>0 then .... { THIS RETURNS NOTHING }

If I input this query into SSMS then the correct information is returned. In all other SELECT queries (without the INNER JOIN) that I use, SQL_dataset returns the recordcount and fieldnames as expected.

Any ideas as to what the problem is and how to get around it?

Update:

My information on TSQLDataset.RecordCount:

http://docwiki.embarcadero.com/Libraries/XE4/en/Data.SqlExpr.TCustomSQLDataSet.RecordCount

From this I didn't get the impression it would not work with a simple query - I have used it successfully thus far with simple SELECT queries as a flag for whether the query returns any data...have I just been lucky? The link above does, however, point out that it will NOT work with parameterized queries, and multi-table joins, so that seems to explain my original problem! So thanks very much for pointing me in the right direction.

This link suggests that if both Bof and Eof are true, then the resultset is empty:

http://docwiki.embarcadero.com/Libraries/XE4/en/Data.DB.TDataSet.Eof

If (SQL_dataset.Bof=True) and (SQL_dataset.Eof=True) then
begin 
  Found:=False;

Is this a better option?

Update 2:

Thanks for the explanation, that is starting to make sense to me. I have removed all references to RecordCount and substituted TSQLDataSet.IsEmpty as suggested (I had missed that method entirely, thanks).

I had thought that as soon as you call TSQLDataset.Open that TSQLDataset.RecordCount would be populated, but if I understand correctly this is not the case?

There are occasions where I scroll through the results as follows:

SQL_dataset.CommandType:=ctQuery; 
SQL_dataset.CommandText:=Query_text;
SQL_dataset.Open;

If SQL_dataset.IsEmpty=False then 
begin
  SQL_dataset.First;

  While not SQL_dataset.Eof do  
  begin
    { DO SOMETHING }
    SQL_dataset.Next;
  end;
end;

This obviously does call TSQLDataset.Next, so I assume this then does all that memory buffering you talk about (as per RecordCount). At what point does this happen exactly?


Solution

That code suits local file-based tables, like DBF and CSV, not a remote SQL dataset.

1) There is no guarantee that RecordCount contains any useful information for anything but local files. If it does, it means all the data was read from the remote server into local client memory. Calling RecordCount on a SQL dataset amounts to saying "I want my application to freeze for an hour until all the data is pulled from the server to the client, and then crash with an 'out of memory' error". Use .IsEmpty, .Bof and .Eof instead.

Actually, where did you get that RecordCount from? If you read the documentation, you saw that it explicitly states that RecordCount does not correspond to the number of records in the database, so checking whether RecordCount > 0 tells you nothing about the database data.
http://docwiki.embarcadero.com/Libraries/XE4/en/Data.DB.TDataSet.RecordCount

2) Use parameters.

Try like this:

with SQL_dataset do begin
     Close;
     CommandType := ctQuery;
     ParamCheck := true;
     CommandText := 'SELECT Table1.Price FROM "Table1"  ' +
          'INNER JOIN "Table2" ON Table1.Code_ID = Table2.ID  '   +
          'WHERE Table2.Code = :Temp_code ';
     Params[0].AsString := 'abcdefgh';
     Open;

     if not IsEmpty then begin
....
     end;
end;

Also, please edit the question's tags to specify the Delphi version and the database access driver.


UPDATE:

Regarding

if (SQL_dataset.Bof=True) and (SQL_dataset.Eof=True) then
begin 
  Found:=False;

this can be written more simply:

Found := not (SQL_dataset.Bof and SQL_dataset.Eof);

or in modern Delphi

Found := not SQL_dataset.IsEmpty;

http://docwiki.embarcadero.com/Libraries/XE4/en/Data.DB.TDataSet.IsEmpty


Regarding "didn't get the impression it would not work with a simple query":

You cannot reliably tell a simple query from a complex one. SELECT * FROM XXX might be a very complex query if XXX is a stored procedure, a VIEW, or a table with some columns being virtual data CALCULATED by sub-queries.

Also, re-read what I wrote above. Getting the final number of records means that the server has to execute the query to the end, the network has to transfer all the data to the end, and the TDataSet has to cache all the received data in memory so that you are able to call .Next.

Imagine a simple query over a simple table that contains 10M rows of 1000 bytes each (like names + photos of clients); that totals nearly 10 GB. Consider a typical 100 Mb/s (approx. 10 MB/s) network. How long would it take to transfer all that data just to learn the row count? And at about 2 GB transferred, your 32-bit application would just die with an out-of-memory error. And all that load when you actually only want to know whether there is at least one row or none.
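
The arithmetic behind that estimate can be written out as follows, taking the figures above at face value (10M rows, 1000 bytes each, roughly 10 MB/s of usable bandwidth, and about 2 GB available to a 32-bit process). These are illustrative round numbers, not measurements:

```latex
\begin{align*}
\text{total size} &= 10^{7}\ \text{rows} \times 10^{3}\ \text{bytes/row} = 10^{10}\ \text{bytes} \approx 10\ \text{GB} \\
\text{full transfer time} &\approx \frac{10^{10}\ \text{bytes}}{10^{7}\ \text{bytes/s}} = 10^{3}\ \text{s} \approx 17\ \text{minutes} \\
\text{time to reach 2 GB} &\approx \frac{2 \times 10^{9}\ \text{bytes}}{10^{7}\ \text{bytes/s}} = 200\ \text{s}
\end{align*}
```

Under these assumptions the client would hit the memory limit a little over three minutes in, long before the count became available.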

UPDATE 2

Checking .IsEmpty right before a while-not-Eof loop seems a bit over-engineered to me. When the dataset is actually empty, the while loop quits without entering the iteration body anyway. So personally, unless you have an else-branch with a specific code path for an empty dataset, I would remove such a check before such a loop.
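
A minimal sketch of the loop without the extra check, reusing the SQL_dataset and Query_text names from the question. The try..finally wrapper is an addition for illustration, and the loop body is a placeholder:

```delphi
// Sketch only: assumes SQL_dataset is a connected TSQLDataSet and
// Query_text already holds the SELECT statement, as in the question.
SQL_dataset.CommandType := ctQuery;
SQL_dataset.CommandText := Query_text;
SQL_dataset.Open;
try
  // On an empty result set Eof is already True right after Open,
  // so the body is never entered and no IsEmpty test is needed.
  while not SQL_dataset.Eof do
  begin
    { DO SOMETHING }
    SQL_dataset.Next;
  end;
finally
  // Release the cursor even if the loop body raises an exception.
  SQL_dataset.Close;
end;
```

The call to First is dropped as well, since a freshly opened dataset is already positioned on its first row.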

As for caching... That is hard to determine for sure. Usually there is a chain: database file -> database server + query -> DB access library -> TDataSet -> grid or other consumer.

At each arrow some caching may or may not happen.

Then there are uni-directional and bi-directional SQL queries/cursors. With a bi-directional one you can call .Next and .Prior freely, which is useful for grids. For the server, however, that means that either it caches all the internal row IDs until the cursor (query) is closed, or its engine and indices naturally allow traversing the query in both directions. I bet that the natural optimization choice of data structures and algorithms for DB servers leads to the former approach. At least I would not rely on the latter as an implicit assumption.

If the TDataSet is also uni-directional, or if both the TDataSet and the underlying library + server are bi-directional, then the TDataSet caches data in small chunks. I'd estimate about a dozen rows. Fetching each row as a separate request would create redundant network round trips, but fetching hundreds or thousands at once would be an unneeded burden on the network (hence delays).

TDBGrid typically does not cache anything itself; instead it sets the TDataSet's BufferCount (not documented, see the sources or articles by Joanna Carter) to the desired number of visible rows plus some overlap for easy scrolling. However, some advanced components like QuantumGrid implement their own internal caching.

So when you feed a long query directly to a grid that the user then scrolls to the bottom, there is a high chance of delays or even memory exhaustion. On the other hand, if your code is like that while loop and you manage to set the query as uni-directional (how exactly to signal this depends on the concrete library and dataset), that gives the whole chain the means to optimize memory consumption and caching performance. Of course, "gives the means" does not guarantee that every link in the chain actually implements it. And of course you would lose the ability to call .Prior, .First, .Locate and such.

Those are generic observations; you have to make your own informed judgment about how much this affects your systems.

There is no silver bullet when you work with large (or potentially large) amounts of data. That is the reason SQL was designed to transfer as little data from the server as possible, doing all the filtering on the remote side. Moreover, the rise of the web stresses the notion of sharding or clustering, where a logically unified database is split among several actual servers to keep memory demands reasonable.

Licensed under: CC-BY-SA with attribution