Question

I deal a lot with text files, comparing one to another in a "SQL manner".

DBD::CSV is an obvious choice to start with, as it lets me use SQL syntax on text tables. However, the text files I deal with are huge, which makes DBD::CSV unusable in terms of performance.
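For context, here is a minimal sketch of the DBD::CSV approach (the file name data.csv and the column num1 are placeholders for illustration):

use strict;
use warnings;
use DBI;

# Treat CSV files in the current directory as SQL tables via DBD::CSV.
# "data.csv" and its column "num1" are hypothetical.
my $dbh = DBI->connect("dbi:CSV:", undef, undef, {
    f_dir      => ".",        # directory that holds the CSV files
    f_ext      => ".csv/r",   # table "data" maps to file "data.csv"
    RaiseError => 1,
}) or die $DBI::errstr;

my $sth = $dbh->prepare("SELECT * FROM data WHERE num1 > 10");
$sth->execute;
while (my $row = $sth->fetchrow_arrayref) {
    print join("\t", @$row), "\n";
}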

So I started writing a module that converts a CSV file into a SQLite DB and then returns a DBI database handle (via DBD::SQLite) that I can play with. The problem is that converting a text file into a SQLite table can also be inefficient, because I cannot run the sqlite3 command line from Perl to load CSV files quickly (using .import). So I have to build one huge INSERT INTO string from the text tables and execute it (executing INSERTs line by line is very inefficient, which is why I preferred one big INSERT). I would like to avoid that, and I am looking for a one-liner to load a CSV into SQLite from Perl.
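For reference, a rough sketch of that "one big INSERT" workaround (the table name test, the file test.csv, and a pre-created schema are assumptions for illustration):

use strict;
use warnings;
use DBI;
use Text::CSV_XS;

my $dbh = DBI->connect("dbi:SQLite:dbname=csvtest.sqlite", "", "", { RaiseError => 1 });
my $csv = Text::CSV_XS->new({ binary => 1 }) or die Text::CSV_XS->error_diag;

# Quote every value and collect one "(v1,v2,...)" group per CSV row
open my $fh, "<", "test.csv" or die $!;
my @values;
while (my $row = $csv->getline($fh)) {
    push @values, "(" . join(",", map { $dbh->quote($_) } @$row) . ")";
}
close $fh;

# One multi-row INSERT; very large statements can still hit SQLite limits,
# which is why the answer below uses bind parameters in a transaction instead.
$dbh->do("INSERT INTO test VALUES " . join(",", @values)) if @values;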

Another thing: I use the following functions to execute a SQL query and print the result nicely:

sub sql_command {
    my ($self, $str) = @_;

    # DBI reports errors via errstr, not $!
    my $s = $self->{_db}->prepare($str) or die $self->{_db}->errstr;
    $s->execute() or die $s->errstr;

    # Collect the column names and all rows, replacing NULLs with "undef"
    my $table;
    push @$table, [ map { defined $_ ? $_ : "undef" } @{ $s->{NAME} } ];
    while (my $row = $s->fetch) {
        push @$table, [ map { defined $_ ? $_ : "undef" } @$row ];
    }

    return box_format($table);
}


sub box_format {
    my $table  = shift;
    my $n_cols = scalar @{ $table->[0] };

    # Column separators; rule() below draws the horizontal lines
    my $tb = Text::Table->new( \'| ', '', ( \' | ', '' ) x ( $n_cols - 1 ), \' |+' );
    $tb->load(@$table);
    my $rule = $tb->rule(qw/- +/);
    my @rows = $tb->body();
    return $rule, shift @rows, $rule, @rows, $rule
        if @rows;
}

The sql_command sub takes about a minute to execute (on a 6.5 MB file), which in my opinion is way longer than it should be. Does anyone have a more efficient solution?

Thanks!

Solution

Text::CSV_XS is extremely fast; using it to handle the CSV parsing should take care of that side of the performance problem.

There should be no need for special bulk insert code to make DBD::SQLite performant. An insert statement with bind parameters is very fast. The main trick is to turn off AutoCommit in DBI and do all the inserts in a single transaction.

use v5.10;
use strict;
use warnings;
use autodie;

use Text::CSV_XS;
use DBI;

my $dbh = DBI->connect(
    "dbi:SQLite:dbname=csvtest.sqlite", "", "",
    {
        RaiseError => 1, AutoCommit => 0
    }
);

$dbh->do("DROP TABLE IF EXISTS test");

$dbh->do(<<'SQL');
CREATE TABLE test (
    name        VARCHAR,
    num1        INT,
    num2        INT,
    thing       VARCHAR,
    num3        INT,
    stuff       VARCHAR
)
SQL

# Using bind parameters avoids having to recompile the statement every time
my $sth = $dbh->prepare(<<'SQL');
INSERT INTO test
       (name, num1, num2, thing, num3, stuff)
VALUES (?,    ?,    ?,    ?,     ?,    ?    )
SQL

my $csv = Text::CSV_XS->new or die;
open my $fh, "<", "test.csv";
while(my $row = $csv->getline($fh)) {
    $sth->execute(@$row);
}
$csv->eof;
close $fh;

$sth->finish;    
$dbh->commit;

This ran through a 5.7 MB CSV file in 1.5 seconds on my MacBook. The file was filled with 70,000 lines of...

"foo",23,42,"waelkadjflkajdlfj aldkfjal dfjl",99,"wakljdlakfjl adfkjlakdjflakjdlfkj"

It might be possible to make it a little faster using bind columns, but in my testing it slowed things down.
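
For completeness, the bind-columns variant mentioned above would look roughly like this, as a drop-in replacement for the read loop in the script above (same $csv, $fh, and $sth):

# Bind one scalar per column; getline() then fills them in place
# instead of returning a fresh array ref for every row.
my ($name, $num1, $num2, $thing, $num3, $stuff);
$csv->bind_columns(\($name, $num1, $num2, $thing, $num3, $stuff));
while ($csv->getline($fh)) {
    $sth->execute($name, $num1, $num2, $thing, $num3, $stuff);
}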
