Question

I have a very large number of objects (about 30k). What is the best way to store and access them? They all have a specific id, but I would also like to filter and search them by name, category, etc. It's quite a simple class, something like the following:

class objclass {
    int id;
    std::string name;
    ...
};

I was thinking about SQL, but I don't know if that's the best way.

Thanks in Advance! :)

Update: Thanks, guys! I think I'll go with a vector then. And thanks for clarifying that 30k isn't that big^^ For me, having never handled this much data, it seemed quite large ;)


Solution

std::vector sounds like a perfect fit. If you know in advance how many elements you will have, use vector::reserve or vector::resize to avoid overallocating. Otherwise, call vector::shrink_to_fit after a large batch of insertions.
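
As a rough illustration of the reserve/shrink_to_fit advice, here is a minimal sketch. The `Item` struct and the helper names `make_items` and `trim` are hypothetical stand-ins, not part of the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type modelled on the class in the question.
struct Item {
    int id;
    std::string name;
};

// Builds `count` placeholder items. Because the element count is known up
// front, reserve() allocates once and push_back never has to reallocate.
std::vector<Item> make_items(std::size_t count) {
    std::vector<Item> items;
    items.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        items.push_back({static_cast<int>(i), "item" + std::to_string(i)});
    }
    assert(items.size() == count);
    return items;
}

// When the final size is not known in advance, shrink_to_fit() can be called
// after loading. It is a non-binding request to release unused capacity.
void trim(std::vector<Item>& items) {
    items.shrink_to_fit();
}
```

Note that shrink_to_fit is only a request; the implementation is allowed to ignore it, so it should not be relied on for exact memory accounting.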

To speed up searches by id, sort the vector by id and use std::binary_search or std::lower_bound.
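
A sketch of the sorted-vector lookup could look like this. `Item`, `sort_by_id` and `find_by_id` are illustrative names chosen here, and the code assumes ids are unique.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type modelled on the class in the question.
struct Item {
    int id;
    std::string name;
};

// Sorts the items by id so that binary search becomes valid.
void sort_by_id(std::vector<Item>& items) {
    std::sort(items.begin(), items.end(),
              [](const Item& a, const Item& b) { return a.id < b.id; });
}

// Returns a pointer to the item with the given id, or nullptr if absent.
// Precondition: `items` is sorted by id (checked in debug builds only).
const Item* find_by_id(const std::vector<Item>& items, int id) {
    assert(std::is_sorted(items.begin(), items.end(),
                          [](const Item& a, const Item& b) { return a.id < b.id; }));
    auto it = std::lower_bound(items.begin(), items.end(), id,
                               [](const Item& item, int key) { return item.id < key; });
    return (it != items.end() && it->id == id) ? &*it : nullptr;
}
```

The lookup is O(log n), which for 30,000 elements means about 15 comparisons. The returned pointer is invalidated if the vector is later resized or re-sorted.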

If you have lots of strings with the same content, use a flyweight class. This can also substantially speed up string comparisons.
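
To show the idea behind a flyweight, here is a hand-rolled string pool. This is a simplified sketch, not the Boost.Flyweight API; `StringPool`, `intern` and `same_category` are hypothetical names, and the pool is assumed to outlive every item that points into it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical string pool: each distinct string is stored once and
// handed out as a pointer to the pooled copy.
class StringPool {
public:
    // Returns a pointer to the pooled copy of `s`, inserting it if needed.
    // Pointers to elements of std::unordered_set stay valid across rehashing.
    const std::string* intern(const std::string& s) {
        return &*pool_.insert(s).first;
    }

    std::size_t size() const { return pool_.size(); }

private:
    std::unordered_set<std::string> pool_;
};

// Hypothetical record that shares its category string with other records.
struct Item {
    int id;
    const std::string* category;  // points into a StringPool
};

// For strings interned in the same pool, equal content means equal pointer,
// so the comparison is a single pointer compare instead of a string compare.
bool same_category(const Item& a, const Item& b) {
    assert(a.category != nullptr && b.category != nullptr);
    return a.category == b.category;
}
```

The trade-off is an extra indirection when reading the string, in exchange for less memory and cheaper equality checks when many objects share the same values.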

To search quickly on string members, build a vector of iterators into your container and sort that, or go for a boost::multi_index container.
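
A secondary index sorted by name might be sketched as follows. This variant stores positions rather than iterators, an assumption made here so the index survives a reallocation of the main vector; `Item`, `build_name_index` and `find_by_name` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical record type modelled on the class in the question.
struct Item {
    int id;
    std::string name;
};

// Builds a secondary index: positions into `items`, ordered by name.
// The main vector itself is left in its original order.
std::vector<std::size_t> build_name_index(const std::vector<Item>& items) {
    std::vector<std::size_t> index(items.size());
    std::iota(index.begin(), index.end(), std::size_t{0});
    std::sort(index.begin(), index.end(),
              [&items](std::size_t a, std::size_t b) {
                  return items[a].name < items[b].name;
              });
    return index;
}

// Returns the position of an item with the given name,
// or items.size() if no item has that name.
std::size_t find_by_name(const std::vector<Item>& items,
                         const std::vector<std::size_t>& index,
                         const std::string& name) {
    assert(index.size() == items.size());
    auto it = std::lower_bound(index.begin(), index.end(), name,
                               [&items](std::size_t pos, const std::string& key) {
                                   return items[pos].name < key;
                               });
    if (it != index.end() && items[*it].name == name) {
        return *it;
    }
    return items.size();
}
```

The index has to be rebuilt (or updated) whenever items are added, removed, reordered or renamed; boost::multi_index takes care of that bookkeeping automatically.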

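A small calculation to back that up: assuming a 4-byte int, strings averaging 20 characters, and 30,000 elements, the payload alone is about 720 kB (30,000 × 24 bytes), and even with the per-string overhead of std::string the total is on the order of 2 megabytes. Nothing to worry about.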

OTHER TIPS

30,000 objects isn't really a "very large number". As long as the objects themselves aren't several kB in size, the whole set should still fit into RAM easily, so there is really no reason to use a database just because of the size.

You could store them all in a std::vector. When you need to search them efficiently, you could create a std::map or std::multimap for each field you want to search by, mapping that field's values to references to your objects.
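
One possible shape for such a per-field index is sketched below. It maps category values to positions in the vector rather than to references, which is an assumption made for simplicity; `Item`, `CategoryIndex`, `build_category_index` and `ids_in_category` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record type with the fields mentioned in the question.
struct Item {
    int id;
    std::string name;
    std::string category;
};

// Maps a category value to the positions of all items in that category.
using CategoryIndex = std::multimap<std::string, std::size_t>;

// Builds the index with one entry per item.
CategoryIndex build_category_index(const std::vector<Item>& items) {
    CategoryIndex index;
    for (std::size_t i = 0; i < items.size(); ++i) {
        index.emplace(items[i].category, i);
    }
    assert(index.size() == items.size());
    return index;
}

// Collects the ids of all items in the given category.
std::vector<int> ids_in_category(const std::vector<Item>& items,
                                 const CategoryIndex& index,
                                 const std::string& category) {
    std::vector<int> ids;
    auto range = index.equal_range(category);
    for (auto it = range.first; it != range.second; ++it) {
        ids.push_back(items[it->second].id);
    }
    return ids;
}
```

Each additional searchable field gets its own map built the same way. The cost is extra memory per index and the need to keep every index in sync when the vector changes.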

There might, however, be other reasons to use a database besides the amount of data. For example when you have other programs (or multiple instances of the same program) which operate on the same data and want to keep the data synchronized between them. Or when you just want a reliable persistence layer. Which database to choose is really up to you. Your requirements (as far as you wrote them) are so generic that any database system should be able to handle them adequately. There might be some aspects of your project which make some databases more fitting than others, but you didn't mention any.

Any SQL database will probably be fine. 30k is not a "very large number"; what makes you think that it is?

Unless your filter criteria are very complex, you might also consider keeping everything in memory. That only works if you don't need persistence of some kind, though, and your requirements are very vague.

So: if you want convenience, I'd choose SQL; if speed is very important, I'd go for an in-memory version with custom filters. But that depends on the kind of data you have and a lot of other factors.

You should pick whichever approach suits you best. SQLite or MySQL can be used from C++ if you need a database.

I think creating a database would be the best.

Licensed under: CC-BY-SA with attribution