Question

I am starting a project on Opinion Mining (Data Mining -> Web Mining -> Opinion Mining) to determine the semantic orientation of the words contained in web pages. We will use a crawler to fetch the pages that hold the opinions. My question is: what type of database (object-oriented, relational, hierarchical, etc.) is best suited to this type of project? I know this is a specific question, so I'm not expecting everybody to respond, but an answer from someone who has already done this would help.

Regards!


Solution

If you need something large-scale and responsive, you would probably need to go for Google's BigTable or something of that nature. At the prototype level, I am sure you can use a traditional relational database, but at a certain point you would hit a performance wall. See Brewer's CAP theorem, which says a distributed data store cannot guarantee consistency, availability, and partition tolerance all at once.

OTHER TIPS

In my experience, a relational database serves this kind of scenario pretty well. You need to be extra careful with the web content part of it: decide whether you want to store it in a database at all, or whether something as simple as the file system will do. BLOBs especially require extra care, and they increase your maintenance work.

Also, given the nature of the project, you will certainly be using a lot of existing components, many of which already support a relational DB as a data store or are easy to extend to use one.
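To make the file-system-plus-relational-DB suggestion concrete, here is a minimal sketch, not a definitive design. It assumes SQLite (through Python's standard `sqlite3` module) as the relational store. The table names `pages` and `word_orientation`, the helper functions, and the numeric `orientation` score are all hypothetical, chosen only for illustration. Raw crawled pages are written to disk as plain files; the database keeps only the path and the per-word results.

```python
import hashlib
import sqlite3
from pathlib import Path

# Hypothetical schema: page metadata plus per-word orientation scores.
# The raw HTML is NOT stored in the database, only the path to it.
SCHEMA = """
CREATE TABLE IF NOT EXISTS pages (
    id         INTEGER PRIMARY KEY,
    url        TEXT NOT NULL UNIQUE,
    file_path  TEXT NOT NULL,
    fetched_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS word_orientation (
    page_id     INTEGER NOT NULL REFERENCES pages(id),
    word        TEXT NOT NULL,
    orientation REAL NOT NULL,
    PRIMARY KEY (page_id, word)
);
"""


def open_store(db_path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def save_page(conn, content_dir, url, html):
    """Write the crawled page to disk and record its location in the DB."""
    content_dir = Path(content_dir)
    content_dir.mkdir(parents=True, exist_ok=True)
    # Name the file after a hash of the URL so any URL maps to a safe filename.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = content_dir / name
    path.write_text(html, encoding="utf-8")
    cur = conn.execute(
        "INSERT INTO pages (url, file_path) VALUES (?, ?)", (url, str(path))
    )
    conn.commit()
    return cur.lastrowid


def save_orientations(conn, page_id, scores):
    """Store a {word: score} mapping produced by the opinion-mining step."""
    conn.executemany(
        "INSERT OR REPLACE INTO word_orientation (page_id, word, orientation) "
        "VALUES (?, ?, ?)",
        [(page_id, word, score) for word, score in scores.items()],
    )
    conn.commit()


def average_orientation(conn, word):
    """Average score of a word across all pages, or None if never seen."""
    row = conn.execute(
        "SELECT AVG(orientation) FROM word_orientation WHERE word = ?", (word,)
    ).fetchone()
    return row[0]
```

With this split, the database stays small and easy to back up, and the aggregate queries over word scores are ordinary SQL. If the prototype outgrows a single machine, the page files and the score tables can be moved to different stores independently.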

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow