Question

I am developing a Firefox extension that will serve as a user profiling and web personalization engine. It needs to store TF-IDF related data for web pages. My question is: which approach would produce faster results for simple searches?

a. Using a custom data structure, storing the entire structure in a file, loading it into memory and querying it?

OR

b. Storing the data in an SQLite database and querying it there?

It is safe to assume a worst case scenario of around 250,000 rows in one of the tables.


Solution

Your question basically boils down to:

a. Should I write my own custom implementation of a data storage system?

or

b. Should I use an off-the-shelf, proven data storage system?

If you go with the first approach, I would say that:

  • You'll obviously end up spending time writing this code. You need to weigh that against the time you would spend learning an existing library and writing code on top of it.
  • You'll inevitably start adding features over time. You'll have to continuously re-evaluate the cost of adding more code versus throwing away the work you've put in and switching to an existing library.
  • You may run into serious performance or other issues. Are you willing to take this risk when something like SQLite has already had a lot of production use to find these issues?
  • How much time are you going to spend dealing with bugs caused by your data storage that could be avoided by using an off-the-shelf library?

Another way of looking at this is: why would you NOT use SQLite? Is there some kind of problem with it for your scenario? I can't think of any.

I would certainly be inclined to start with SQLite (or something similar). If it proves not to work in some way, I would consider writing my own data storage library only after exhausting the other off-the-shelf alternatives.
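To make the SQLite route concrete, here is a minimal sketch using Python's built-in sqlite3 module. Python is used only to keep the example self-contained; the extension itself would go through whatever SQLite binding its platform offers. The table name `term_scores`, its columns, the index name, and the synthetic rows are all assumptions made for illustration, not part of the question. The point is that a simple lookup by term can be served from an index rather than a scan of all 250,000 rows.

```python
import sqlite3


def build_store(rows):
    """Load (term, page_url, tfidf) rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per (term, page) pair with its TF-IDF weight.
    conn.execute(
        "CREATE TABLE term_scores ("
        " term TEXT NOT NULL,"
        " page_url TEXT NOT NULL,"
        " tfidf REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO term_scores VALUES (?, ?, ?)", rows)
    # Index on the lookup column so a search by term does not scan the table.
    conn.execute("CREATE INDEX idx_term_scores_term ON term_scores (term)")
    conn.commit()
    return conn


def top_pages(conn, term, limit=5):
    """Return up to `limit` (page_url, tfidf) pairs for `term`, best first."""
    cur = conn.execute(
        "SELECT page_url, tfidf FROM term_scores"
        " WHERE term = ? ORDER BY tfidf DESC LIMIT ?",
        (term, limit),
    )
    return cur.fetchall()


# Synthetic worst case from the question: 250,000 rows (made-up values).
rows = [
    (f"term{i % 5000}", f"http://example.invalid/page{i}", (i % 97) / 97.0)
    for i in range(250_000)
]
conn = build_store(rows)
result = top_pages(conn, "term42", limit=3)
```

Timing `top_pages` on data shaped like your own is the only reliable way to judge whether this is fast enough for the extension.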

OTHER TIPS

Why not use a data structure such as a dictionary or a binary tree? Choose the data structure based on the expected number of searches, retrievals, inserts, and deletes.
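As a rough sketch of the dictionary idea, the following Python keeps everything in memory, keyed by term. The function names `build_index` and `lookup` and the (term, page_url, tfidf) row shape are hypothetical, chosen only for illustration. Saving the structure to a file and loading it again is left out.

```python
from collections import defaultdict


def build_index(rows):
    """Group (term, page_url, tfidf) rows by term, highest score first."""
    index = defaultdict(list)
    for term, page_url, tfidf in rows:
        index[term].append((page_url, tfidf))
    # Sort each posting list once so lookups only need to slice it.
    for postings in index.values():
        postings.sort(key=lambda posting: posting[1], reverse=True)
    return index


def lookup(index, term, limit=5):
    """Return up to `limit` (page_url, tfidf) pairs for `term`."""
    # .get() avoids creating an empty entry for an unknown term.
    return index.get(term, [])[:limit]


# Made-up sample data.
sample_rows = [
    ("firefox", "http://example.invalid/a", 0.12),
    ("firefox", "http://example.invalid/b", 0.80),
    ("sqlite", "http://example.invalid/a", 0.45),
]
index = build_index(sample_rows)
hits = lookup(index, "firefox", limit=1)
```

A lookup here is a single hash probe plus a slice. The trade-off is that inserts and deletes must keep each list sorted, and the whole structure has to be written out and read back by your own code.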

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow