Question

Do you think that using a MongoDB JSON database to store application log files is a good idea, and why?

The only advantage I see is the schema abstraction, but I think it is also a weakness: we cannot ensure the integrity of a log file.

Solution

Obviously I'm biased (I work on MongoDB) but I think it works very well for logs.

Reasons:

  • It's fast for inserts and updates... you can do thousands per second
  • As well as normal queries, you can run analytics and generate reports using JavaScript. You could have a cron job running nightly which does nice MapReduce things to your logs.
  • You can use capped collections, which are fixed-size collections that act like queues, to keep only the latest N KBs/MBs/GBs of logs
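As a rough illustration of the points above, here is a minimal sketch of a Python logging handler that writes each log record as one document. The class names `MongoLogHandler` and `InMemoryCollection`, the field names, and the database and collection names in the comments are all made up for this example. The handler only assumes an object with an `insert_one` method, which is what a pymongo collection provides, and the in-memory stand-in is there so the sketch runs without a server.

```python
import logging
from datetime import datetime, timezone


class MongoLogHandler(logging.Handler):
    """Hypothetical handler: stores each log record as one document."""

    def __init__(self, collection):
        super().__init__()
        # Anything with an insert_one(dict) method works here,
        # for example a pymongo Collection.
        self.collection = collection

    def emit(self, record):
        # Field names are illustrative; no schema is imposed.
        doc = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        try:
            self.collection.insert_one(doc)
        except Exception:
            # Standard logging convention: never let logging crash the app.
            self.handleError(record)


class InMemoryCollection:
    """Stand-in for a real collection so the sketch runs without a server."""

    def __init__(self):
        self.docs = []

    def insert_one(self, doc):
        self.docs.append(doc)


# With a running server and pymongo installed, one might instead use
# (database and collection names are assumptions):
#   from pymongo import MongoClient
#   db = MongoClient()["logdb"]
#   collection = db.create_collection(
#       "app_logs", capped=True, size=100 * 1024 * 1024)  # keep ~100 MB
collection = InMemoryCollection()

logger = logging.getLogger("myapp")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example's output quiet
logger.addHandler(MongoLogHandler(collection))
logger.info("user %s logged in", "alice")
```

With a capped collection created as in the comment, the oldest documents would be overwritten once the size limit is reached, so no separate cleanup job should be needed.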

I'm not sure what you mean by "ensure the integrity of a log file"... do you mean you are worried about not knowing what fields the document you're pulling out has? If so, I think you'll find it's no harder than dealing with null fields in a relational database, and much more flexible.
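To make the comparison with null columns concrete, here is a small sketch using plain Python dicts, which is how pymongo hands documents back. The events and the `user_of` helper are hypothetical; the point is only that a missing field and an explicit null can be handled with the same one-line check.

```python
# Hypothetical log documents: the second one has no "user" field at all,
# the third has it set to None (stored as null).
events = [
    {"level": "INFO", "message": "login ok", "user": "alice"},
    {"level": "ERROR", "message": "disk full"},
    {"level": "INFO", "message": "login ok", "user": None},
]


def user_of(event):
    # dict.get returns None for a missing key, so a missing field and
    # an explicit null fall through to the same default.
    return event.get("user") or "<unknown>"


users = [user_of(e) for e in events]
```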

See also: the MongoDB blog post on logging.

OTHER TIPS

I'm using MongoDB to store logs from many applications and it's working out very well so far.

You might want to take a look at the slides from a presentation on Logging Application Behavior to MongoDB that I gave at Mongo SV and at the last MongoDB SF Meetup for more background on why I think it is good for logging, as well as for info on libraries for Java, Python, Ruby, PHP and C# that support logging to MongoDB.

I'm now the main committer on log4mongo-java, Log4J appenders for MongoDB. So, it's probably not too surprising that that's what I'm using.

With respect to log integrity, I assume you mean confidence that it hasn't been modified after it was written. One option you have, at least with log4mongo-java, is to store logging events in a database that requires authentication. That would limit to some degree the number of users who could add, delete or update events.

In addition, you could set up a replication slave that is tightly locked down. Frequent backups of the slave would at least limit the time during which the set of logged events could be modified.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow