Question

LogParser isn't open source and I need this functionality for an open source project I'm working on.

I'd like to write a library that allows me to query huge (mostly IIS) log files, preferably with Linq.

Do you have any links that could help me? How does a program like LogParser work so fast? How does it handle memory limitations?


Solution

It probably processes the log as it reads it, so the library never has to allocate a huge amount of memory to hold the whole file. It can read a chunk, process it, and throw it away. This is a common and very effective way to process large amounts of data.

You could, for example, work line by line and parse each line. For the actual parsing you can write a state machine or, if the requirements allow it, use a regex.
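As a rough illustration of the line-by-line idea, here is a minimal Python sketch (not how LogParser itself is implemented, which I don't know). It assumes the W3C extended format that IIS writes by default, where a `#Fields:` directive names the space-separated columns. The function name `parse_w3c_lines` and the sample data are made up for the example.

```python
import io
from typing import Dict, Iterable, Iterator


def parse_w3c_lines(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per log entry; only the current line is held in memory."""
    fields = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("#Fields:"):
            # The directive names the columns of the entries that follow.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            # Assumption: values are separated by single spaces.
            yield dict(zip(fields, line.split(" ")))


# Made-up sample; a real caller would pass an open file object instead.
sample = io.StringIO(
    "#Version: 1.0\n"
    "#Fields: date time cs-method cs-uri-stem sc-status\n"
    "2012-07-01 10:00:00 GET /index.html 200\n"
    "2012-07-01 10:00:01 GET /missing.html 404\n"
)

# A generator expression filters lazily, much like a LINQ Where/Select chain.
not_found = [e["cs-uri-stem"] for e in parse_w3c_lines(sample)
             if e["sc-status"] == "404"]
print(not_found)
```

Because the parser is a generator, a query over it consumes one line at a time, so memory use does not grow with the size of the file.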

Another approach is a state machine that both reads and parses the data. This might be needed if a log entry can span more than one line.
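A small sketch of such a combined reader and parser, in Python. The format is hypothetical: fields are separated by spaces, an entry ends at a newline, and a double-quoted field may contain spaces and newlines. The names `State` and `read_entries` are invented for the example, and details such as escaped quotes or `\r\n` line endings are left out.

```python
import io
from enum import Enum, auto
from typing import Iterator, List, TextIO


class State(Enum):
    FIELD = auto()   # outside quotes: space and newline are delimiters
    QUOTED = auto()  # inside quotes: every character belongs to the field


def read_entries(stream: TextIO, chunk_size: int = 4096) -> Iterator[List[str]]:
    """Read fixed-size chunks and yield each entry as a list of fields."""
    state = State.FIELD
    fields: List[str] = []
    current: List[str] = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for ch in chunk:
            if state is State.QUOTED:
                if ch == '"':
                    state = State.FIELD
                else:
                    current.append(ch)  # newlines are kept inside quotes
            elif ch == '"':
                state = State.QUOTED
            elif ch == " ":
                fields.append("".join(current))
                current = []
            elif ch == "\n":
                fields.append("".join(current))
                current = []
                yield fields
                fields = []
            else:
                current.append(ch)
    if current or fields:
        # Final entry without a trailing newline.
        fields.append("".join(current))
        yield fields


text = 'GET /a "line one\nline two" 200\nGET /b "ok" 404\n'
entries = list(read_entries(io.StringIO(text), chunk_size=5))
print(entries)
```

The state survives across chunk boundaries, so an entry can straddle two reads without the whole file ever being loaded.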

Some state machine related links:

A very simple state machine written in C: http://snippets.dzone.com/posts/show/3793

A lot of Python-related code, but some sections are universally applicable: http://www.ibm.com/developerworks/library/l-python-state.html

OTHER TIPS

If your aim is to query IIS log data with LINQ, I suggest moving the raw IIS log data into a database and querying the database with LINQ. This blog post might help:

http://getsrirams.blogspot.in/2012/07/migrate-iislog-data-to-sqlce-4-database.html
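To show the general shape of the load-then-query approach, here is a minimal Python sketch. It swaps in SQLite and plain SQL for the SQL CE database and LINQ mentioned above, and the table name `hits`, its columns, and the rows are all made up for illustration.

```python
import sqlite3

# Made-up rows standing in for already-parsed log entries.
rows = [
    ("2012-07-01", "10:00:00", "GET", "/index.html", 200),
    ("2012-07-01", "10:00:01", "GET", "/missing.html", 404),
    ("2012-07-01", "10:00:02", "GET", "/about.html", 200),
]

conn = sqlite3.connect(":memory:")  # use a file path for real data
conn.execute(
    "CREATE TABLE hits "
    "(date TEXT, time TEXT, method TEXT, uri TEXT, status INTEGER)"
)
# executemany accepts any iterable, so rows can be streamed from a parser.
conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

counts = conn.execute(
    "SELECT status, COUNT(*) FROM hits GROUP BY status ORDER BY status"
).fetchall()
print(counts)
conn.close()
```

The trade-off is an up-front import cost in exchange for indexed, repeatable queries afterwards.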

Licensed under: CC-BY-SA with attribution