SQL queries taking too long
-
08-07-2019 - |
Question
I'm using ADO to save data to an MS Access database. It was taking quite a while to save the data to file (about 7 seconds - which is too long for our purposes). I looked at the number of SQL queries being run, and it's about 4200; although there isn't a whole heap of data.
The database connection seems to be the bottleneck. Do you know of any way to decrease the amount of time this takes; either by somehow combining multiple statements into one to reduce overhead, or some ADO/MS-Access trick?
Can you, for instance, insert multiple rows into a table at once, and would this be noticeably faster?
Extra info:
One reason we have so many queries is that we insert a row, and then have another query to retrieve its autoincremented ID; then use this ID to insert several more rows, linking them to the first
In response to several comments and responses: I am leaving the connection open the entire time, and performing it as a single transaction with BeginTransaction() and CommitTransaciton()
Solution
Some folk have posted that @@IDENTITY
would be fast, so here’s a proof (using VBA) of how my INSERT INTO
two tables at once via a VIEW
trick is about three times faster than doing two INSERTS and grabbing the @@IDENTITY
values each time... which is hardly surprising because the latter involves three Execute
statements and the former only involves one :)
On my machine for the 4200 iterations, the VIEW
trick took 45 seconds and the @@IDENTITY
approach took 127 seconds:
Sub InitInerts()
On Error Resume Next
Kill Environ$("temp") & "\DropMe.mdb"
On Error GoTo 0
Dim cat
Set cat = CreateObject("ADOX.Catalog")
With cat
.Create _
"Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & _
Environ$("temp") & "\DropMe.mdb"
With .ActiveConnection
Dim Sql As String
Sql = _
"CREATE TABLE TableA" & vbCr & "(" & vbCr & " ID IDENTITY NOT" & _
" NULL UNIQUE, " & vbCr & " a_col INTEGER NOT NULL" & vbCr & ")"
.Execute Sql
Sql = _
"CREATE TABLE TableB" & vbCr & "(" & vbCr & " ID INTEGER NOT" & _
" NULL UNIQUE" & vbCr & " REFERENCES TableA (ID)," & _
" " & vbCr & " b_col INTEGER NOT NULL" & vbCr & ")"
.Execute Sql
Sql = _
"CREATE VIEW TestAB" & vbCr & "(" & vbCr & " a_ID, a_col, " & vbCr & " " & _
" b_ID, b_col" & vbCr & ")" & vbCr & "AS " & vbCr & "SELECT A1.ID, A1.a_col," & _
" " & vbCr & " B1.ID, B1.b_col" & vbCr & " FROM TableA AS" & _
" A1" & vbCr & " INNER JOIN TableB AS B1" & vbCr & " " & _
" ON A1.ID = B1.ID"
.Execute Sql
End With
Set .ActiveConnection = Nothing
End With
End Sub
Sub TestInerts_VIEW()
Dim con
Set con = CreateObject("ADODB.Connection")
With con
.Open _
"Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & _
Environ$("temp") & "\DropMe.mdb"
Dim timer As CPerformanceTimer
Set timer = New CPerformanceTimer
timer.StartTimer
Dim counter As Long
For counter = 1 To 4200
.Execute "INSERT INTO TestAB (a_col, b_col) VALUES (" & _
CStr(counter) & ", " & _
CStr(counter) & ");"
Next
Debug.Print "VIEW = " & timer.GetTimeSeconds
End With
End Sub
Sub TestInerts_IDENTITY()
Dim con
Set con = CreateObject("ADODB.Connection")
With con
.Open _
"Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & _
Environ$("temp") & "\DropMe.mdb"
Dim timer As CPerformanceTimer
Set timer = New CPerformanceTimer
timer.StartTimer
Dim counter As Long
For counter = 1 To 4200
.Execute "INSERT INTO TableA (a_col) VALUES (" & _
CStr(counter) & ");"
Dim identity As Long
identity = .Execute("SELECT @@IDENTITY;")(0)
.Execute "INSERT INTO TableB (ID, b_col) VALUES (" & _
CStr(identity) & ", " & _
CStr(counter) & ");"
Next
Debug.Print "@@IDENTITY = " & timer.GetTimeSeconds
End With
End Sub
What this shows is the the bottleneck now is the overhead associated with executing multiple statements. What if we could do it in just one statement? Well, guess what, using my contrived example, we can. First, create a Sequence table of unique integers, being a standard SQL trick (every database should have one, IMO):
Sub InitSequence()
Dim con
Set con = CreateObject("ADODB.Connection")
With con
.Open _
"Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & _
Environ$("temp") & "\DropMe.mdb"
Dim sql As String
sql = _
"CREATE TABLE [Sequence]" & vbCr & "(" & vbCr & " seq INTEGER NOT NULL" & _
" UNIQUE" & vbCr & ");"
.Execute sql
sql = _
"INSERT INTO [Sequence] (seq) VALUES (-1);"
.Execute sql
sql = _
"INSERT INTO [Sequence] (seq) SELECT Units.nbr + Tens.nbr" & _
" + Hundreds.nbr + Thousands.nbr AS seq FROM ( SELECT" & _
" nbr FROM ( SELECT 0 AS nbr FROM [Sequence] UNION" & _
" ALL SELECT 1 FROM [Sequence] UNION ALL SELECT 2 FROM" & _
" [Sequence] UNION ALL SELECT 3 FROM [Sequence] UNION" & _
" ALL SELECT 4 FROM [Sequence] UNION ALL SELECT 5 FROM" & _
" [Sequence] UNION ALL SELECT 6 FROM [Sequence] UNION" & _
" ALL SELECT 7 FROM [Sequence] UNION ALL SELECT 8 FROM" & _
" [Sequence] UNION ALL SELECT 9 FROM [Sequence] ) AS" & _
" Digits ) AS Units, ( SELECT nbr * 10 AS nbr FROM" & _
" ( SELECT 0 AS nbr FROM [Sequence] UNION ALL SELECT" & _
" 1 FROM [Sequence] UNION ALL SELECT 2 FROM [Sequence]" & _
" UNION ALL SELECT 3 FROM [Sequence] UNION ALL SELECT" & _
" 4 FROM [Sequence] UNION ALL SELECT 5 FROM [Sequence]" & _
" UNION ALL SELECT 6 FROM [Sequence] UNION ALL SELECT" & _
" 7 FROM [Sequence] UNION ALL SELECT 8 FROM [Sequence]" & _
" UNION ALL SELECT 9 FROM [Sequence] ) AS Digits )" & _
" AS Tens, ( SELECT nbr * 100 AS nbr FROM ( SELECT" & _
" 0 AS nbr FROM [Sequence] UNION ALL SELECT 1 FROM" & _
" [Sequence] UNION ALL SELECT 2 FROM [Sequence] UNION"
sql = sql & _
" ALL SELECT 3 FROM [Sequence] UNION ALL SELECT 4 FROM" & _
" [Sequence] UNION ALL SELECT 5 FROM [Sequence] UNION" & _
" ALL SELECT 6 FROM [Sequence] UNION ALL SELECT 7 FROM" & _
" [Sequence] UNION ALL SELECT 8 FROM [Sequence] UNION" & _
" ALL SELECT 9 FROM [Sequence] ) AS Digits ) AS Hundreds," & _
" ( SELECT nbr * 1000 AS nbr FROM ( SELECT 0 AS nbr" & _
" FROM [Sequence] UNION ALL SELECT 1 FROM [Sequence]" & _
" UNION ALL SELECT 2 FROM [Sequence] UNION ALL SELECT" & _
" 3 FROM [Sequence] UNION ALL SELECT 4 FROM [Sequence]" & _
" UNION ALL SELECT 5 FROM [Sequence] UNION ALL SELECT" & _
" 6 FROM [Sequence] UNION ALL SELECT 7 FROM [Sequence]" & _
" UNION ALL SELECT 8 FROM [Sequence] UNION ALL SELECT" & _
" 9 FROM [Sequence] ) AS Digits ) AS Thousands;"
.Execute sql
End With
End Sub
Then use the Sequence table to enumerate the values from 1 to 42000 and construct rows in a single INSERT INTO..SELECT statement:
Sub TestInerts_Sequence()
Dim con
Set con = CreateObject("ADODB.Connection")
With con
.Open _
"Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & _
Environ$("temp") & "\DropMe.mdb"
Dim timer As CPerformanceTimer
Set timer = New CPerformanceTimer
timer.StartTimer
.Execute "INSERT INTO TestAB (a_col, b_col) " & _
"SELECT seq, seq " & _
"FROM Sequence " & _
"WHERE seq BETWEEN 1 AND 4200;"
Debug.Print "Sequence = " & timer.GetTimeSeconds
End With
End Sub
That executes on my machine in 0.2 of a second!
OTHER TIPS
More recent versions of Access support the @@IDENTITY variable. You can use this to retrieve the identity column after an insert, without doing a query.
INSERT INTO mytable (field1,field2) VALUES (val1,val2);
SELECT @@IDENTITY;
See this knowledge base article.
we insert a row, and then have another query to retrieve its autoincremented ID; then use this ID to insert several more rows, linking them to the first
Is this one table, two tables or more than two tables?
If one table, you could consider a different design e.g. you could generate your own random identifiers, use the nested sets model rather than the adjacency list model, etc. Hard to know if you won't share your design ;-)
If two tables, assuming there is a FOREIGN KEY
between the tables on the AUTOINCREMENT/IDENTITY
column, you can create a VIEW
that INNER JOIN
s the two tables and INSERT INTO
the VIEW
and the AUTOINCREMENT/IDENTITY
value will be 'copied' to the referencing table. More details and a working example in this Stack Overflow answer (link below).
If more than one table, the VIEW
trick doesn't extend beyond two tables AFAIK, so you may just have to live with the poor performance or change technology e.g. DAO is reported faster than ADO, SQL Server may be faster than ACE/Jet, etc. Again, feel free to share your design, there could be a 'think outside the box' solution.
For your situation, it may be best to use ADO to do the insert instead of running an SQL command.
for i = 1 to 100
rs.AddNew
rs("fieldOne") = "value1"
rs("fieldOne") = "value2"
rs.Update
id = rs("uniqueIdColumn")
'do stuff with id...
next
I know that using ADO is slow, but it's many times quicker than opening 4200 connections...
Some proposal (that can even be combinated):
- Why don't you get the autoincremented ID before inserting the row, so you'll have the value already available for your next query? This should spare you some time.
- The closing/opening of the connexion has also to be checked.
- Have you thought about manipulating recordsets (with Visual basic code to update your data) instead of tables (with SQL instructions sent to the database)?
- Another solution (if you are sure that the connection is the bottleneck) would be to create the table locally and then export it to the access file once the job is done.
- As you are using ADO, you can even save your data as an XML file, load a recordset from this XML file, manipulate it with VBA, and export it to an Access database
Not meaning to be a smart-arse ... But, is there a reason for continuing to use Access? SQL Server Express is free, faster and more capable ...
It sounds like you're importing data, and Access has much better facilities available for importing data. Yes, it inserts many records at a time, and it would be faster.
Can you describe the application a bit more?
The cheapest and (mostly everytime) best speed boost you can give to a database is to keep non changing data in cache.
Don't know if it can apply to you case, but it is very simple. You can just plug in a (free) library that do it or keep a local collection of items that were read. Example :
- One want to read item X.
- Does item X is in the local collection ?
- No : read item X from database and put it in the local collection.
- Yes : just return local copy of item X.
Of course, it can be a little more complicated when you are runnning a web site with a lot of server. In that case, juste use Application Blocks or Cached.
Allow me to refute the following assertions:
the SELECT @@IDENTITY suggestion -- definitely a vastly faster way to insert data and retrieve the Autonumber value than open an AddOnly recordset, updating the fields, saving the record and storing the Autonumber value. SQL INSERT will always be faster than using the row-by-row recordset.
I thought the two recordsets approach might be a bit quicker than the @@IDENTITY approach because I suspected it involves fewer database roundtrips. In my testing it was a lot quicker at 1.2 seconds, compared with 127 seconds for the @@IDENTITY approach. Here's my code (code to create the .mdb is posted in another answer):
Sub TestInerts_rs()
Dim con
Set con = CreateObject("ADODB.Connection")
con.Open _
"Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & _
Environ$("temp") & "\DropMe.mdb"
Dim timer As CPerformanceTimer
Set timer = New CPerformanceTimer
timer.StartTimer
Dim rs1
Set rs1 = CreateObject("ADODB.Recordset")
With rs1
.ActiveConnection = con
.CursorType = 1 ' keyset
.LockType = 3 ' optimistic
.Source = "SELECT a_col, ID FROM TableA;"
.Open
End With
Dim rs2
Set rs2 = CreateObject("ADODB.Recordset")
With rs2
.ActiveConnection = con
.CursorType = 1 ' keyset
.LockType = 3 ' optimistic
.Source = "SELECT b_col, ID FROM TableB;"
.Open
End With
Dim counter As Long
For counter = 1 To 4200
rs1.AddNew "a_col", counter
Dim identity As Long
identity = rs1.Fields("ID").value
rs2.AddNew Array(0, 1), Array(counter, identity)
Next
Debug.Print "rs = " & timer.GetTimeSeconds
End Sub
Some thoughts that may or may not be helpful:
Let me second the SELECT @@IDENTITY suggestion -- definitely a vastly faster way to insert data and retrieve the Autonumber value than open an AddOnly recordset, updating the fields, saving the record and storing the Autonumber value. SQL INSERT will always be faster than using the row-by-row recordset.
using DAO transactions you might be able to convert multiple such inserts into a batch that gets executed at once. I'm not sure about this, since I don't know exactly what you're doing and how many tables you're using. The point is you'd run the series of SQL INSERTs in a transaction, and then do the .Commit at the end, such that the actual writing to the real database file will be happening as a single operation, instead of as 4200 (or however many) individual operations.