Question

A while back I created a log parser. The logs range from several thousand lines to millions of lines. I store the parsed entries in an array of hash refs.

I am looking for suggestions on how to store my output so that I can quickly read it back in if the script is run again (which avoids re-parsing the log).

The end goal is to have a web interface that will allow users to create queries (basically treating the parsed output like it existed within a database).

I have already considered writing the output of Data::Dumper to a file.

Here is an example array entry printed with Data::Dumper:

$VAR = 
          {
            'weekday' => 'Sun',
            'index' => 26417,
            'timestamp' => '1316326961',
            'text' => 'sys1  NSP
Test.cpp      1000
This is a example error message.
',
            'errname' => 'EM_TEST',
            'time' => {
                        'array' => [
                                     2011,
                                     9,
                                     18,
                                     '06',
                                     22,
                                     41
                                   ],
                        'stamp' => '20110918062241',
                        'whole' => '06:22:41',
                        'hour' => '06',
                        'sec' => 41,
                        'min' => 22
                      },
            'month' => 'Sep',
            'errno' => '2261703',
            'dayofmonth' => 18,
            'unknown2' => '1',
            'unknown3' => '1',
            'year' => 2011,
            'unknown1' => '0',
            'line' => 219154
          },

Is there a more efficient way of accomplishing my goal?

Thanks!


Solution

If your output is an object (or if you want to make it into an object), then you can use KiokuDB (along with a database back end of your choice). If not, then you can use Storable. Of course, if your data structure essentially mimics a CSV file, then you can just write the output to a file. Or you can serialize the data as JSON and store that in a file. Or you can forgo the middleman and simply use a database.
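As a rough illustration of the JSON option, here is a minimal sketch using JSON::PP, which has shipped with Perl since 5.14. The sample entries and the file name parsed_log.json are made up for the example; only a few of the fields from the question are shown.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical sample data, shaped like the parsed entries in the question.
my @entries = (
    { index => 26417, errname => 'EM_TEST', timestamp => '1316326961',
      text => "sys1  NSP\nTest.cpp      1000\n" },
    { index => 26418, errname => 'EM_OTHER', timestamp => '1316326999',
      text => "sys2  NSP\n" },
);

my $cache_file = 'parsed_log.json';    # assumed file name
my $json = JSON::PP->new->utf8->canonical;

# Write the whole array as a single JSON document.
open my $out, '>', $cache_file or die "Cannot write $cache_file: $!";
print {$out} $json->encode(\@entries);
close $out or die "Cannot close $cache_file: $!";

# Read it back, as a later run of the script would.
open my $in, '<', $cache_file or die "Cannot read $cache_file: $!";
my $raw = do { local $/; <$in> };      # slurp the whole file
close $in;
my $restored = $json->decode($raw);

printf "%d entries restored, first errname: %s\n",
    scalar @$restored, $restored->[0]{errname};
```

Nested structures such as the time hash survive the round trip unchanged. For logs with millions of entries, a faster JSON module or Storable will probably load noticeably quicker than JSON::PP.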

You mentioned that your data structure is an "array of hashes" (presumably you mean an array of hash references). If the keys of each hash reference are the same, then you can store this in CSV.

You're unlikely to get a specific answer without being more specific about your data.

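Edit: Now that you've posted some sample data, you can simply write this to a CSV file or a database with the values for index, timestamp, text, errname, errno, unknown1, unknown2, unknown3, and line. The remaining fields (weekday, month, dayofmonth, year, and the nested time hash) can be recomputed from timestamp, so they do not need to be stored.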

OTHER TIPS

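use strict;
use warnings;
use Storable qw(store retrieve);

# fill my hash (example values)
my %hash = (errname => 'EM_TEST', errno => 2261703);

# write the hash to disk
store \%hash, 'file';

# later, for example in another run: read it back
%hash = %{ retrieve('file') };

# print my hash
print "$_ => $hash{$_}\n" for sort keys %hash;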
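
The same idea works for an array of hash refs, and it can double as a cache so the log is re-parsed only when it has changed. The following is a sketch under assumptions: parse_log is a stand-in for the real parser, and the file names are hypothetical.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);

my $log_file   = 'app.log';             # hypothetical paths
my $cache_file = 'app.log.storable';

# Stand-in for the real parser: returns a reference to an array of hash refs.
sub parse_log {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot read $path: $!";
    my @entries;
    while (my $line = <$fh>) {
        chomp $line;
        push @entries, { line => $., text => $line };
    }
    close $fh;
    return \@entries;
}

# Reuse the cache only if it exists and is at least as new as the log.
sub load_entries {
    my ($log, $cache) = @_;
    my $cache_mtime = (stat $cache)[9];  # undef if the cache does not exist
    my $log_mtime   = (stat $log)[9];
    if (defined $cache_mtime && $cache_mtime >= $log_mtime) {
        return retrieve($cache);
    }
    my $entries = parse_log($log);
    nstore($entries, $cache);            # network byte order, portable across machines
    return $entries;
}

# Create a tiny demo log so the sketch runs on its own.
open my $demo, '>', $log_file or die "Cannot write $log_file: $!";
print {$demo} "first line\nsecond line\n";
close $demo or die "Cannot close $log_file: $!";

my $first  = load_entries($log_file, $cache_file);   # parses and writes the cache
my $second = load_entries($log_file, $cache_file);   # served from the cache

printf "%d entries, last text: %s\n", scalar @$second, $second->[-1]{text};
```

File modification times have one-second resolution here, so a log that changes within the same second as the cache write would not trigger a re-parse. Comparing file sizes as well is one way to tighten that.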

You can always use KiokuDB, Storable or what have you, but if you are planning to do aggregation, a relational database (or some other data store that supports queries) may be the best solution in the long run. A lightweight data store with an SQL engine, like SQLite, doesn't require running a database server and could be a good starting point.
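
To give an idea of what the SQLite route might look like, here is a minimal sketch using DBI with the DBD::SQLite driver (both need to be installed from CPAN). The table name log_entry, the column types, and the sample rows are assumptions made for the example. The index field is stored as idx because INDEX is a reserved word in SQL.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical sample entries using a few of the fields from the question.
my @entries = (
    { index => 26417, timestamp => 1316326961, errname => 'EM_TEST',
      errno => 2261703, line => 219154, text => "sys1  NSP\nTest.cpp      1000\n" },
    { index => 26418, timestamp => 1316327000, errname => 'EM_TEST',
      errno => 2261703, line => 219160, text => "sys1  NSP\n" },
    { index => 26419, timestamp => 1316327100, errname => 'EM_OTHER',
      errno => 17, line => 219170, text => "sys2  NSP\n" },
);

# An in-memory database keeps the sketch self-contained; a file name such as
# "dbname=parsed_log.db" would persist the data between runs.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

$dbh->do(<<'SQL');
CREATE TABLE log_entry (
    idx       INTEGER PRIMARY KEY,
    timestamp INTEGER NOT NULL,
    errname   TEXT,
    errno     INTEGER,
    line      INTEGER,
    text      TEXT
)
SQL

# Wrap all inserts in one transaction; committing per row is much slower.
my $insert = $dbh->prepare(
    'INSERT INTO log_entry (idx, timestamp, errname, errno, line, text)
     VALUES (?, ?, ?, ?, ?, ?)');
$dbh->begin_work;
$insert->execute(@{$_}{qw(index timestamp errname errno line text)}) for @entries;
$dbh->commit;

# Example aggregation: number of occurrences per error name.
my $counts = $dbh->selectall_arrayref(
    'SELECT errname, COUNT(*) AS n FROM log_entry GROUP BY errname ORDER BY n DESC',
    { Slice => {} });

printf "%-10s %d\n", $_->{errname}, $_->{n} for @$counts;
```

A web front end could then build its queries as parameterized SQL against this table instead of scanning the array in memory.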

Licensed under: CC-BY-SA with attribution