How do I optimize RavenDB queries for retrieving all documents?

Question

Here are the reasons for the three delays:

Initialization Delay
Initializing a document store is indeed one of the most expensive operations. Because you are running RavenDB in embedded mode, initialization has to start the database itself, not just set up a connection to it. On my machine (a 2.3 GHz i5 laptop), it took 2516 ms to initialize.

If you were running a full RavenDB server (not embedded), the bulk of the delay would occur when starting the server itself. Initializing the client would be significantly faster.

This is reasonable behavior, considering that IDocumentStore (whether embedded or normal) is meant to be kept as a singleton. There should only ever be one instance of this in your application, and it should be created on startup and disposed on shutdown.
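As a minimal sketch of the singleton approach, assuming the embedded client (`Raven.Client.Embedded`) is referenced: the `DocumentStoreHolder` class name and the `"Data"` directory are hypothetical placeholders, not something from your code.

```csharp
using System;
using Raven.Client;
using Raven.Client.Embedded;

public static class DocumentStoreHolder
{
    // Lazy<T> runs the expensive initialization once, on first access,
    // and is thread-safe by default.
    private static readonly Lazy<IDocumentStore> LazyStore =
        new Lazy<IDocumentStore>(() =>
        {
            var store = new EmbeddableDocumentStore
            {
                DataDirectory = "Data" // hypothetical path; use your own
            };
            store.Initialize();
            return store;
        });

    // The one shared instance for the whole application.
    public static IDocumentStore Store
    {
        get { return LazyStore.Value; }
    }
}
```

Touching `DocumentStoreHolder.Store` once during application startup pays the initialization cost up front, and calling `DocumentStoreHolder.Store.Dispose()` on shutdown releases it. Sessions are cheap by comparison, so open one per unit of work with `DocumentStoreHolder.Store.OpenSession()`.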

First Store Delay
Because you are not providing an Id of your own, Raven is auto-generating one for you using its HiLo generation algorithm. This involves allocating a block of assignable ids from the database, which does take a minor amount of time. Subsequent calls will be faster, because they don't have to hit the database until the block has been used up.

If you supply your own Id property and fill it with a valid identifier such as entities/1, entities/2, etc., then storing will be much faster because you skip the key generation entirely.
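A rough sketch of storing with an explicit Id might look like the following. The `EntityA` shape (including the `Name` property), the `EntityAs/1` key, and the `StoreWithExplicitId` helper are assumptions for illustration.

```csharp
using Raven.Client;

public class EntityA
{
    public string Id { get; set; }
    public string Name { get; set; } // hypothetical property
}

public static class StoreWithExplicitId
{
    public static void Save(IDocumentStore store)
    {
        using (IDocumentSession session = store.OpenSession())
        {
            // The Id is already set, so the client has no reason to
            // request a HiLo block from the database for this entity.
            session.Store(new EntityA { Id = "EntityAs/1", Name = "first" });
            session.SaveChanges();
        }
    }
}
```

The trade-off is that you become responsible for keeping the identifiers unique.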

Query Delay
The first call to .Query<T>() when you do not specify a static index will try to create a dynamic index that matches the query expression. This is true even when getting "all" entities, because it still has to filter by the entity type using the Raven-Entity-Name metadata. Collections in RavenDB are a virtual thing, determined by the metadata. The documents actually live all together, so the usual way to get all items in a "collection" is to query and filter by the metadata.

Part of the delay you are seeing is the dynamic index being created, and the rest is the time it takes for the documents to be indexed. Note that if you added more items (say, a few hundred), you would see about the same delay but would not get back all of the items: the index would be stale because it was just created, so Raven would return only the ones it had indexed so far.

In a test like yours, you would probably want to explicitly wait for non-stale results. In a real application, you would probably want to pre-define a static index instead. Querying against a static index would in fact speed up your query, because the delay moves from query time to index-creation time.
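Both options can be sketched roughly as follows. The index class `EntityAs_All`, the `Name` property it maps, and the `QueryExamples` helper are hypothetical; treat this as an outline under those assumptions rather than a drop-in implementation.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client;
using Raven.Client.Indexes;

public class EntityA
{
    public string Id { get; set; }
    public string Name { get; set; } // hypothetical property
}

// Hypothetical static index over all EntityA documents.
public class EntityAs_All : AbstractIndexCreationTask<EntityA>
{
    public EntityAs_All()
    {
        Map = entities => from e in entities
                          select new { e.Name };
    }
}

public static class QueryExamples
{
    // Run once at startup, right after the document store is initialized,
    // so the index-creation cost is paid there and not on the first query.
    public static void CreateIndexes(IDocumentStore store)
    {
        IndexCreation.CreateIndexes(typeof(EntityAs_All).Assembly, store);
    }

    // Intended for tests: blocks until the index has caught up.
    // The usual query page-size limit still applies to the result.
    public static List<EntityA> QueryNonStale(IDocumentSession session)
    {
        return session.Query<EntityA, EntityAs_All>()
            .Customize(x => x.WaitForNonStaleResults())
            .ToList();
    }
}
```

Waiting for non-stale results is generally best kept to tests, since in production it makes the query wait on indexing.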

If you want to avoid using an index at all, there is another way:

session.Advanced.LoadStartingWith<EntityA>("EntityAs/");

This method doesn't use the metadata to filter; it uses the key name itself. It goes against the document store directly without a query, so it is much faster. You will need to paginate to get lots of results, but you have the same concern with querying anyway. With this method, however, the default page size is much smaller (25), so you will run into that limit sooner rather than later.
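Paging through a key prefix could be sketched like this. The `PrefixLoader` helper and the page size of 100 are assumptions, and the named `start` and `pageSize` arguments are used because the optional parameters of `LoadStartingWith` differ between client versions.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client;

public class EntityA
{
    public string Id { get; set; }
}

public static class PrefixLoader
{
    public static List<EntityA> LoadAll(IDocumentSession session, int pageSize = 100)
    {
        var all = new List<EntityA>();
        var start = 0;

        while (true)
        {
            // Each call is one request to the database.
            var page = session.Advanced
                .LoadStartingWith<EntityA>("EntityAs/", start: start, pageSize: pageSize)
                .ToList();

            all.AddRange(page);

            // A short (or empty) page means there is nothing left to fetch.
            if (page.Count < pageSize)
                break;

            start += pageSize;
        }

        return all;
    }
}
```

Keep in mind that a session limits the number of requests it will make by default, so for a very large key range you may need a larger page size or more than one session.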

I hope this answered your concerns. Please let me know in comments if you have others.