Question

I am trying to learn how to use RavenDB, and for that I created a basic example. It seems that initializing the store and querying takes a huge amount of time!

static void Main( string[] args )
{
    const bool createNewEntities = true;

    var sw = new Stopwatch();
    using( var store = new EmbeddableDocumentStore {DataDirectory = "~\\Data"} )
    {
        sw.Start();
        store.Initialize();
        sw.Stop();
        Console.WriteLine( "Initialized in {0} ms.", sw.ElapsedMilliseconds );

        if (createNewEntities)
        {
            sw.Reset();
            sw.Start();
            using( var session = store.OpenSession() )
            {
                sw.Stop();
                Console.WriteLine();
                Console.WriteLine( "Opened session in {0} ms.", sw.ElapsedMilliseconds );

                for( var i = 0; i < 10; i++ )
                {
                    var entity = new EntityA( "Entity A " + DateTime.Now.ToLongTimeString() );

                    sw.Reset();
                    sw.Start();
                    session.Store( entity );
                    sw.Stop();

                    if (i < 3)
                        Console.WriteLine( "Stored '{0}' in {1} ms.", entity.Name, sw.ElapsedMilliseconds );
                }

                sw.Reset();
                sw.Start();
                session.SaveChanges();
                sw.Stop();
                Console.WriteLine( "Saved changes in {0} ms.", sw.ElapsedMilliseconds );
            }
        }


        sw.Reset();
        sw.Start();
        using( var session = store.OpenSession() )
        {
            sw.Stop();
            Console.WriteLine();
            Console.WriteLine( "Opened EntityA session in {0} ms.", sw.ElapsedMilliseconds );

            sw.Reset();
            sw.Start();
            var entities = session.Query<EntityA>().ToArray();
            sw.Stop();
            Console.WriteLine("Queried for all {0} EntityA in {1} ms.", entities.Length, sw.ElapsedMilliseconds);
        }


        sw.Reset();
        sw.Start();
        using( var session = store.OpenSession() )
        {
            sw.Stop();
            Console.WriteLine();
            Console.WriteLine( "Opened EntityA session (again) in {0} ms.", sw.ElapsedMilliseconds );

            sw.Reset();
            sw.Start();
            var entities2 = session.Query<EntityA>().ToArray();
            sw.Stop();
            Console.WriteLine( "Queried (again) for all {0} EntityA in {1} ms.", entities2.Length, sw.ElapsedMilliseconds );
        }
    }


    Console.WriteLine();
    Console.WriteLine();
    Console.WriteLine( "Press ENTER to exit..." );
    Console.ReadLine();
}

This produces the following output:

Initialized in 6132 ms.

Opened session in 3 ms.
Stored 'Entity A 08:50:14' in 129 ms.
Stored 'Entity A 08:50:15' in 0 ms.
Stored 'Entity A 08:50:15' in 0 ms.
Saved changes in 29 ms.

Opened EntityA session in 0 ms.
Queried for all 10 EntityA in 463 ms.

Opened EntityA session (again) in 0 ms.
Queried (again) for all 10 EntityA in 1 ms.

From this crude example, I can see that:

  • Initializing the store takes a huge amount of time!!
  • Storing the first entity (of ten) takes quite some time.
  • Querying for all entities takes a lot of time the first time, but no time at all the second time.

How do I properly query the DB for all documents of a certain type (EntityA)? Surely, it cannot be that RavenDB requires an index for every query? Especially not for queries without any criteria?

(Note: I intend to use the DB embedded in a desktop application, where listing all documents is used to display the contents of the DB.)


Solution

Here are the reasons for the three delays:

Initialization Delay
Initializing a document store is indeed one of the most expensive operations. Since you are running RavenDB in embedded mode, it not only has to set up the connection to the database, it also has to start the database engine itself. On my machine (a 2.3 GHz i5 laptop), it took 2516 ms to initialize.

If you were running a full RavenDB server (not embedded) - the bulk of the delay would be when starting the server itself. Initializing the client would be significantly faster.

This is reasonable behavior, considering that IDocumentStore (whether embedded or normal) is meant to be kept as a singleton. There should only ever be one instance of this in your application, and it should be created on startup and disposed on shutdown.
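As a rough sketch of the singleton approach, you could wrap the store in a Lazy<T> so that it is initialized exactly once, on first use. The DocumentStoreHolder class and its member names are my own invention for illustration and are not part of RavenDB; the data directory is simply copied from your example.

```csharp
using System;
using Raven.Client;
using Raven.Client.Embedded;

// Hypothetical helper (not a RavenDB type): holds the one store instance
// for the whole application.
public static class DocumentStoreHolder
{
    // Lazy<T> is thread-safe by default, so CreateStore runs at most once.
    private static readonly Lazy<IDocumentStore> LazyStore =
        new Lazy<IDocumentStore>(CreateStore);

    public static IDocumentStore Store
    {
        get { return LazyStore.Value; }
    }

    private static IDocumentStore CreateStore()
    {
        // Same settings as in the question; the expensive part is Initialize().
        var store = new EmbeddableDocumentStore { DataDirectory = "~\\Data" };
        store.Initialize();
        return store;
    }

    // Call once on application shutdown.
    public static void Shutdown()
    {
        if (LazyStore.IsValueCreated)
            LazyStore.Value.Dispose();
    }
}
```

In a desktop application you could touch DocumentStoreHolder.Store from a background thread during startup, so the initialization cost is paid while the UI is loading, and open short-lived sessions from it everywhere else.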

First Store Delay
Because you are not providing an Id of your own, Raven is auto-generating one for you using its HiLo generation algorithm. This involves allocating a block of assignable ids from the database, which does take a minor amount of time. Subsequent calls will be faster, because they don't have to hit the database until the block has been used up.

If you supply your own Id property and fill it with a valid identifier such as entities/1, entities/2, etc., then storing will be much faster because you skip the key generation entirely.
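Something along these lines would do it. This is only a sketch: your EntityA class is not shown in the question, so the shape below (a string Id plus Name) is an assumption, and the EntityASeeder helper is a made-up name. I use the "EntityAs/" prefix here because that is the collection name Raven's default conventions would derive for a class called EntityA.

```csharp
using Raven.Client;

// Assumed shape of the entity; the question does not show the real class.
public class EntityA
{
    public string Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper for illustration.
public static class EntityASeeder
{
    public static void StoreWithExplicitIds(IDocumentStore store, int count)
    {
        using (var session = store.OpenSession())
        {
            for (var i = 1; i <= count; i++)
            {
                // The Id is already set, so Raven does not need to ask the
                // HiLo generator for a block of keys before storing.
                session.Store(new EntityA
                {
                    Id = "EntityAs/" + i,
                    Name = "Entity A " + i
                });
            }

            // Nothing is written to the database until SaveChanges.
            session.SaveChanges();
        }
    }
}
```

The trade-off is that you become responsible for keeping the ids unique yourself, which the HiLo generator otherwise does for you.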

Query Delay
The first call to .Query<T>() when you do not specify a static index will try to create a dynamic index that matches the query expression. This is true even when getting "all" entities, because it still has to filter by the entity type using the Raven-Entity-Name metadata. Collections in RavenDB are a virtual thing, determined by the metadata. The documents actually live all together - so there's no other way to get all items in a "collection" than querying and filtering by the metadata.

Part of the delay you are seeing is the dynamic index being constructed. Then there is a further delay while the items are indexed. Note that if you added more items (say a few hundred), you would still see about the same delay, but you would not get back all of them: the index would be stale since it was just created, and Raven would return only a small number of them. In a test like yours, you would probably want to explicitly wait for non-stale results. In a real application, you would probably want to pre-define a static index instead. You could in fact speed up your query by going against a static index, because the delay moves from query time to index creation time.
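To make both suggestions concrete, here is a minimal sketch of a static index plus a query that waits for non-stale results. The index name EntityAs_ByName, the EntityAQueries helper and the shape of EntityA are assumptions made up for this example; only AbstractIndexCreationTask, IndexCreation.CreateIndexes, Query<T, TIndex>() and WaitForNonStaleResults() come from the RavenDB client.

```csharp
using System.Linq;
using Raven.Client;
using Raven.Client.Indexes;

// Assumed shape of the entity; the question does not show the real class.
public class EntityA
{
    public string Id { get; set; }
    public string Name { get; set; }
}

// A static index over all EntityA documents (the name is my own choice).
public class EntityAs_ByName : AbstractIndexCreationTask<EntityA>
{
    public EntityAs_ByName()
    {
        Map = entities => from entity in entities
                          select new { entity.Name };
    }
}

// Hypothetical helper for illustration.
public static class EntityAQueries
{
    // Call once at startup, right after store.Initialize().
    // The index creation cost is paid here instead of on the first query.
    public static void RegisterIndexes(IDocumentStore store)
    {
        IndexCreation.CreateIndexes(typeof(EntityAs_ByName).Assembly, store);
    }

    // Suitable for tests: blocks until the index has caught up.
    public static EntityA[] QueryAllWaitingForIndex(IDocumentStore store)
    {
        using (var session = store.OpenSession())
        {
            return session.Query<EntityA, EntityAs_ByName>()
                .Customize(x => x.WaitForNonStaleResults())
                .ToArray();
        }
    }
}
```

I would keep WaitForNonStaleResults() out of production code paths where possible, since it blocks the caller until indexing has finished. Also keep in mind that a query still returns a single page of results, so for larger collections you need Skip/Take on top of this.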

If you want to avoid using an index at all, there is another way:

session.Advanced.LoadStartingWith<EntityA>("EntityAs/");

This method doesn't use the metadata to filter; it uses the key name itself. It goes against the document store directly without a query, so it is much faster. You will need to paginate to get lots of results, but you have the same concern with querying anyway. With this method, though, the default page size is much smaller (25), so you will most definitely run into that sooner rather than later.
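Paging through the results could look roughly like this. The PrefixLoader helper and its LoadAll method are hypothetical names of mine; the only RavenDB call is LoadStartingWith with its start and pageSize parameters. I pass those as named arguments because the exact overload differs a little between client versions, and I am assuming a page size of 25 to match the default mentioned above.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client;

// Hypothetical helper for illustration.
public static class PrefixLoader
{
    public static List<T> LoadAll<T>(IDocumentSession session, string keyPrefix)
    {
        const int pageSize = 25; // assumed; same as the default page size
        var all = new List<T>();
        var start = 0;

        while (true)
        {
            // Fetch one page of documents whose key begins with keyPrefix.
            var page = session.Advanced
                .LoadStartingWith<T>(keyPrefix, start: start, pageSize: pageSize)
                .ToArray();

            all.AddRange(page);

            // A short (or empty) page means there is nothing left to read.
            if (page.Length < pageSize)
                break;

            start += page.Length;
        }

        return all;
    }
}
```

You would call it as PrefixLoader.LoadAll<EntityA>(session, "EntityAs/"). Every page is a separate request, and a session limits the number of requests it will make by default, so for a large collection you may want a bigger page size or a fresh session per batch rather than one long loop.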

I hope this answered your concerns. Please let me know in comments if you have others.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow