I have a fairly large dataset (900K records, 140 MB on disk) stored in a CSV file in a client app (.NET 4.0). I need to load this data into a Postgres 9 database as fast as possible. I use the Npgsql "NpgsqlCopyIn" technique (Npgsql library version 2.1.0).
For a probe load (138K records) the insertion works fine - it takes about 7 seconds.
But for the whole batch (900K), the code throws a timeout exception:
"ERROR: 57014: canceling statement due to statement timeout"
The stack trace is:

at Npgsql.NpgsqlState.d_9.MoveNext()
at Npgsql.NpgsqlState.ProcessAndDiscardBackendResponses(NpgsqlConnector context)
at Npgsql.NpgsqlCopyInState.SendCopyDone(NpgsqlConnector context)
at Npgsql.NpgsqlCopyInState.StartCopy(NpgsqlConnector context, NpgsqlCopyFormat copyFormat)
at Npgsql.NpgsqlState.d_9.MoveNext()
at Npgsql.NpgsqlState.ProcessAndDiscardBackendResponses(NpgsqlConnector context)
at Npgsql.NpgsqlConnector.ProcessAndDiscardBackendResponses()
at Npgsql.NpgsqlCommand.ExecuteBlind()
at Npgsql.NpgsqlCopyIn.Start()
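As far as I understand, SQL error 57014 means the server-side statement_timeout fired while the COPY was running. A minimal sketch (my addition, not part of the failing code) to check what timeout the session actually sees, given an open NpgsqlConnection con:

using (NpgsqlCommand showCmd = new NpgsqlCommand("SHOW statement_timeout;", con))
{
    // Prints the effective statement_timeout for this session,
    // e.g. "30s", or "0" if the timeout is disabled.
    Console.WriteLine("statement_timeout = " + showCmd.ExecuteScalar());
}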
I tried setting CommandTimeout to large values (>7200 seconds) and to zero; I tried the same values for the connection string "Timeout" parameter, and I also tried setting "CommandTimeout" via the connection string. Still no result - "ERROR 57014" comes up again and again.
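For reference, the connection string variant looked roughly like this (host, credentials, and database are placeholders, not my real values; both timeouts are in seconds):

string connection_string =
    "Server=127.0.0.1;Port=5432;Database=mydb;" +
    "User Id=myuser;Password=mypass;" +
    "Timeout=60;CommandTimeout=7200;";

As far as I can tell, "Timeout" only limits connection establishment, not statement execution, so it probably should not matter here - but I tried it anyway.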
Please help me load the whole batch correctly!
Here is the code I use:
private static void pgBulkCopy(string connection_string, FileInfo fiDataFile)
{
    using (Npgsql.NpgsqlConnection con = new Npgsql.NpgsqlConnection(connection_string))
    {
        con.Open();

        // Stream the CSV file directly into the COPY command.
        using (FileStream ifs = new FileStream(fiDataFile.FullName, FileMode.Open, FileAccess.Read))
        {
            string queryString = "COPY schm.Addresses(FullAddress,lat,lon) FROM STDIN;";
            NpgsqlCommand cmd = new NpgsqlCommand(queryString, con);
            cmd.CommandTimeout = 7200; // 7200 sec = 120 min = 2 hours

            NpgsqlCopyIn copyIn = new NpgsqlCopyIn(cmd, con, ifs);
            try
            {
                copyIn.Start();
                copyIn.End();
            }
            catch (Exception ex)
            {
                Console.WriteLine("[DB] pgBulkCopy error: " + ex.Message);
            }
            finally
            {
                con.Close();
            }
        }
    }
}
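For completeness, the method is invoked like this (the file path is just an example):

pgBulkCopy(connection_string, new FileInfo(@"C:\data\addresses.csv"));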