Question

Below is a code that dumps an SQL query to an aligned text table format.

sub sql_command {
    my ($self,$str) = @_;
    my $s = $self->{_db}->prepare($str) or die $!;
    $s->execute() or die $!;
    my $table;
    push @$table, [ map { defined $_ ? $_ : "undef" } @{$s->{'NAME'}}];
    while(my $row = $s->fetch) {
          push @$table, [ map{ defined $_ ? $_ : "undef" }@$row ];
    }
    return box_format($table);;
}


sub box_format {
    my $table = shift;
    my $n_cols = scalar @{$table->[0]};

    my $tb = Text::Table->new(\'| ', '', (\' | ','')x($n_cols-1), \' |+');
    $tb->load(@$table);
    my $rule = $tb->rule(qw/- +/);
    my @rows = $tb->body();
    return $rule, shift @rows, $rule, @rows, $rule
           if @rows;
}

The output is as expected. But performance-wise, it takes about ~30sec to process on about an output file of size ~5MB.

Is there a better way to achieve that in terms of performance?

Thanks!

Was it helpful?

Solution

There's several possible performance issues with that code. First is that in building $table you're pulling the entire data set into memory. This can consume a lot of memory. You'd be much better off formatting the data as it comes out. This means instead of using Text::Table->load you'd use Text::Table->add.

sub sql_command {
    my ($self,$sql) = @_;
    my $sth = $self->{_db}->prepare($sql) or die $!;
    $sth->execute() or die $!;

    return box_format_from_query($sth);
}

sub box_format_from_query {
    my $sth = shift;
    my $headers = [ map { defined $_ ? $_ : "undef" } @{$s->{'NAME'}}];
    my $num_cols = @$headers;

    my $table = Text::Table->new(\'| ', '', (\' | ','')x($num_cols-1), \' |+');
    while(my $row = $s->fetch) {
          $table->add(map { defined $_ ? $_ : "undef" } @$row );
    }

    my $rule = $tb->rule(qw/- +/);
    my @rows = $tb->body();
    return $rule, shift @rows, $rule, @rows, $rule
           if @rows;
}

The second performance problem is likely Text::Table itself. As the table methods work on copies of the data (that is, you don't pass in the data as references) its not likely to have been written with large data sets in mind. The only way to find out is to profile your code. Have a look at Devel::NYTProf. You should be able to figure out where your code is spending its time. If it turns out to be inside Text::Table methods, you should contact the author or consider another method of producing formatted output.

The third performance problem might be your query. You don't show your query, but its entirely possible its inefficient. Devel::NYTProf will tell you if you're spending a lot of time at $sth->execute. If so, you have another question to ask. :)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top