Question

I am developing a Rails application that will access a lot of RSS feeds or crawl sites for data (mostly news). It will be something like Google News but with a different approach, so I'll store a lot of news (or news summaries), classify them in different categories and use ranking and recommendation techniques.

  • Should I go with MySQL?

  • Is it worthwhile using IBM DB2 pureXML to store the documents? Also, Ruby search implementations (Ferret, Ultrasphinx and others) are not needed if I choose DB2. Is that correct?

  • What are the advantages of PostgreSQL in this?

  • Does it make sense to use CouchDB in this scenario?

I'd like to choose the best option without over-complicating the solution, so I discarded the idea of using two different storage solutions (one for the news documents and another for the rest of the data). I'm also considering only "free" options, so I didn't look at Oracle or MS SQL Server.

Thanks in advance.

Solution

pureXML is heavier than plain SQL, so you pay more for each round trip between the web server and the DB. If you plan to have lots of users, I'd avoid it; you're better off letting your web server cache the responses so that you don't regenerate the XML (RSS) on every request, if that is what you are thinking about.
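To illustrate the caching idea, here is a minimal sketch of an in-memory cache with a time-to-live, written in plain Ruby. The names `FeedCache` and `render_feed` are hypothetical and not part of Rails or any library; `render_feed` stands in for whatever expensive step (database query plus XML generation) the application actually performs.

```ruby
# Hypothetical helper: keeps rendered feed XML in memory for a fixed time-to-live.
# Not thread-safe; a real deployment would more likely use a shared cache store.
class FeedCache
  def initialize(ttl_seconds, clock: -> { Time.now })
    @ttl = ttl_seconds
    @clock = clock      # injectable so expiry can be exercised without sleeping
    @entries = {}       # key => [expires_at, body]
  end

  # Returns the cached body for key, or runs the block to build and store it.
  def fetch(key)
    now = @clock.call
    expires_at, body = @entries[key]
    return body if expires_at && now < expires_at

    body = yield
    @entries[key] = [now + @ttl, body]
    body
  end
end

# Stand-in for the expensive step (assumed, for illustration only).
def render_feed(category)
  "<rss><channel><title>#{category}</title></channel></rss>"
end

cache = FeedCache.new(300)
renders = 0
2.times do
  cache.fetch("world") do
    renders += 1
    render_feed("world")
  end
end
puts renders # the block runs only on the first request
```

In a Rails application the same pattern is usually expressed with `Rails.cache.fetch(key, expires_in: ...)` rather than a hand-rolled class; the sketch only shows why the second request never reaches the database.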

I'd go with MySQL because it's really good at serving and it's totally free. PostgreSQL is free too, but I haven't used it, so I can't say.

CouchDB could make sense, but not if you plan on doing OLAP (online analytical processing) on your data; a normal RDBMS will be better at that.

OTHER TIPS

Admitting first that I generally don't like MySQL, I will say that there has been some writing on this topic in favor of PostgreSQL:

http://oldmoe.blogspot.com/2008/08/101-reasons-why-postgresql-is-better.html

This is always my choice when I need a pure relational database. I don't know whether a document database would be more appropriate for your application without knowing more about it. It does sound like it's something you should at least investigate.

MySQL is probably one of the best options out there; light, easy to install and maintain, multiplatform and free. On top of that there are some good free client tools.

Something to think about: because of the nature of your system, you will probably have some tables that grow large very quickly, so you might want to think about performance.

MySQL supports table partitioning (horizontal, by rows, rather than vertical), but only from version 5.1. Keep that in mind.

Cheers,

Jacobo.

It sounds to me like the application you will build could easily become a large-scale web app. I would suggest PostgreSQL, as it is known for its reliability.

You can check out the following link -- Bob Ippolito from MochiMedia explains why they ditched MySQL for PostgreSQL. Although the posts are more than 3 years old, the recent issues with MySQL 5.1 suggest that they are still relevant.

http://bob.pythonmac.org/archives/category/sql/mysql/

MySQL is good in production. I haven't used PostgreSQL with Rails, but it's a good solution as well.

In the dev and test environments I'd start out with SQLite (default), and perhaps migrate to your target DB in the test environment as you move closer to completion.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow