Question

Quite simply, I need to store time series data in a document. I have decided that having one document responsible for a 30-minute period of data is reasonable. This is only one of a few hundred to a few thousand documents that will be updated every second. The document could look like this:

{
    _id: "APAC.tky001.cpu.2011.12.04:10:00",
    field1: XX,
    field2: YY,
    1322971800: 22,
    1322971801: 23,
    1322971802: 21,

    // and so on
}

This means that every 30 minutes, I create the document with _id, field1 and field2. Then, every second I would like to add a timestamp/value combination.
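
To make the bucketing concrete, here is a minimal sketch in plain C (no driver calls) of how a sample's timestamp could be mapped to its position inside a 30-minute document and turned into a field name. The names `slot_for` and `field_key` are hypothetical helpers, and the sketch assumes the start time of the current document is known, taking 1322971800 (the first key in the sample document) as that start.

```c
#include <assert.h>
#include <stdio.h>

#define BUCKET_SECONDS 1800LL /* 30 minutes of one-second samples */

/* Returns the 0-based position of ts inside the bucket that begins at
 * bucket_start, or -1 if ts falls outside that 30-minute window. */
long long slot_for(long long bucket_start, long long ts)
{
    long long offset = ts - bucket_start;
    return (offset >= 0 && offset < BUCKET_SECONDS) ? offset : -1;
}

/* Writes the decimal field name for ts into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
int field_key(long long ts, char *buf, size_t len)
{
    int n;
    assert(buf != NULL);
    n = snprintf(buf, len, "%lld", ts);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

For example, `slot_for(1322971800, 1322971802)` gives 2, while a timestamp 1800 seconds or more past the start gives -1, which would be the signal to create the next document.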

I am using the mongo C library and was assuming it would be super fast, but the way I am doing this requires a mongo_update, which cannot be done in bulk. I don't think there's a way to use mongo_insert_batch.

Unfortunately, it's super slow: terrible performance. Am I doing this completely incorrectly? By terrible, I mean that with some crude work I get 600/second, whereas in an alternative database (not naming names) I get 27,000/second.

The code is approximately:

for (i = 0; i < N; i++) {
    if (mongo_update(c, n, a, b, MONGO_UPDATE_UPSERT, write_concern) != MONGO_OK) {
        /* error handling */
    }
}

Setting the write concern on or off makes no difference.


Solution

Each of your updates is likely to grow the document beyond the space allocated for it. This means that the update is no longer cheap, because MongoDB has to copy the document to a new location. You could manually pad documents by inserting some large dummy value when creating the document and removing it later, so that your updates happen in place. I'm not sure if you can manipulate the collection-level paddingFactor directly.
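
As a rough guide to how large the dummy value would need to be, the arithmetic can be sketched in plain C. This assumes 1800 one-second entries per document, 10-character timestamp keys as in the question, and that each value is stored as a 32-bit integer. `element_size` and `padding_estimate` are hypothetical helper names, not driver functions. A BSON element takes one type byte, the key plus a NUL terminator, and then the value bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of one BSON element:
 * 1 type byte + key characters + 1 NUL terminator + value bytes. */
size_t element_size(size_t key_len, size_t value_size)
{
    return 1 + key_len + 1 + value_size;
}

/* Approximate space the per-second entries will need once the
 * document is full, i.e. how much room to reserve up front. */
size_t padding_estimate(size_t slots, size_t key_len, size_t value_size)
{
    assert(slots > 0);
    return slots * element_size(key_len, value_size);
}
```

Under those assumptions `padding_estimate(1800, 10, 4)` comes to 28,800 bytes. With 8-byte values (doubles or 64-bit integers) it would be 36,000 bytes.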

In that other, unnamed database you probably insert one row per entry, which is a totally different operation from what you are doing here.

Other tips

MongoDB's latest C driver does support bulk insert:

http://api.mongodb.org/c/current/bulk.html#bulk-insert

#include <assert.h>
#include <bcon.h>
#include <mongoc.h>
#include <stdio.h>

static void
bulk1 (mongoc_collection_t *collection)
{
   mongoc_bulk_operation_t *bulk;
   bson_error_t error;
   bson_t *doc;
   bson_t reply;
   char *str;
   bool ret;
   int i;

   bulk = mongoc_collection_create_bulk_operation (collection, true, NULL);

   for (i = 0; i < 10000; i++) {
      doc = BCON_NEW ("i", BCON_INT32 (i));
      mongoc_bulk_operation_insert (bulk, doc);
      bson_destroy (doc);
   }

   ret = mongoc_bulk_operation_execute (bulk, &reply, &error);

   str = bson_as_json (&reply, NULL);
   printf ("%s\n", str);
   bson_free (str);

   if (!ret) {
      fprintf (stderr, "Error: %s\n", error.message);
   }

   bson_destroy (&reply);
   mongoc_bulk_operation_destroy (bulk);
}

int
main (int argc,
      char *argv[])
{
   mongoc_client_t *client;
   mongoc_collection_t *collection;

   mongoc_init ();

   client = mongoc_client_new ("mongodb://localhost/");
   collection = mongoc_client_get_collection (client, "test", "test");

   bulk1 (collection);

   mongoc_collection_destroy (collection);
   mongoc_client_destroy (client);

   mongoc_cleanup ();

   return 0;
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow