Question

I need to pull many $slice parts from many documents.

At the moment I cannot see how to do this in a single query, so I use a parallel loop to pull the data. However, I now want to retrieve over 1000 nodes at a time, and this does not perform well, especially as the MongoDB C# driver does not support async queries.

This is the (example but similar) structure:

{ TransactionId: BinData, Outputs: [ { Data: BinData }, { Data: BinData } ] }

My code at the moment gets each one individually, like so:

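// One slot per requested input; a slot stays null if its transaction is not found.
var outputs = new SpendableOutput[inputArray.Length];

// One query per input: match the transaction by hash and project only the single
// output at PrevTxOutputIndex via $slice on the "o" (Outputs) array.
Parallel.ForEach(inputArray, (input, s, i) =>
{
    var transactionQuery = Transactions.Find(Query<TransactionInfo>.EQ(x => x.Hash, input.PrevTxHash))
                                .SetFields(new FieldsBuilder().Slice("o", (int)input.PrevTxOutputIndex, 1)
                                                                .Include("_id"));
    var transaction = transactionQuery.ToArray();
    if (transaction.Length != 0)
    {
        outputs[i] = new SpendableOutput
        {
            TxHash = input.PrevTxHash,
            Index = (int)input.PrevTxOutputIndex,
            // The projection returns at most one element, so it sits at position 0.
            Output = transaction[0].Outputs[0]
        };
    }
});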

As you can see, I am querying the Transactions collection's "o" field, which is an array of Outputs, and I want a few of those Outputs. I realize I could ask for all Transactions whose _id is in my list and extract the Outputs after they have been retrieved, but many Transactions have very large Output lists from which I usually need only one or two items.

The first improvement would be to get many slices from one document, like this (this does not work):

var transactionQuery = Transactions.Find(Query<TransactionInfo>.EQ(x => x.Hash, input.PrevTxHash))
                            .SetFields(new FieldsBuilder().Slice("o", itemNeededIndex1, 1)
                                                          .Slice("o", itemNeededIndex2, 1)
                                                          .Slice("o", itemNeededIndex3, 1).Include("_id"));

The second way (much preferred) would be to send a batch of Query<> objects, as I usually have over 1000 Transaction objects (each with associated Outputs to retrieve).

Is there a MongoDB query expert who could suggest how to achieve either of these, or suggest an alternative approach I have not thought of?

EDIT:

The Parent ID and Child Index come from an external input that defines which items may be required to clear a particular balance. That input provides only the ID of the Parent and the index of the Child: it is an array of pairs, each a byte array (Parent ID/hash) and an int (Child Index).


Solution

I have determined that there is no way to achieve this in a single query; instead, many separate queries must be built and sent to MongoDB.
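Even with many queries, the count can be reduced to one per distinct transaction rather than one per requested output. The following is only a minimal sketch of that idea, not something from the original answer: it groups the requested (hash, index) pairs by parent hash and computes one covering $slice window (skip, limit) per transaction. `OutputRef`, `SlicePlan`, and `SlicePlanner` are hypothetical names introduced for illustration, and the sketch assumes the requested indexes for a transaction are close together, since a wide window would pull outputs that are not needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one input item: parent hash plus child index.
public sealed class OutputRef
{
    public byte[] PrevTxHash;
    public int PrevTxOutputIndex;
}

// Hypothetical plan for one query: a single $slice window over the "o" array.
public sealed class SlicePlan
{
    public byte[] TxHash;
    public int Skip;      // first requested index
    public int Limit;     // window length covering all requested indexes
    public int[] Indexes; // requested indexes, ascending and de-duplicated
}

public static class SlicePlanner
{
    public static List<SlicePlan> Plan(IEnumerable<OutputRef> inputs)
    {
        return inputs
            // byte[] compares by reference, so group on a hex string of the hash.
            .GroupBy(r => BitConverter.ToString(r.PrevTxHash))
            .Select(g =>
            {
                var indexes = g.Select(r => r.PrevTxOutputIndex)
                               .Distinct()
                               .OrderBy(i => i)
                               .ToArray();
                return new SlicePlan
                {
                    TxHash = g.First().PrevTxHash,
                    Skip = indexes[0],
                    Limit = indexes[indexes.Length - 1] - indexes[0] + 1,
                    Indexes = indexes
                };
            })
            .ToList();
    }
}
```

Each plan would then map onto the projection already used in the question, `.Slice("o", plan.Skip, plan.Limit)`, and the output requested at index `i` would sit at position `i - plan.Skip` of the returned array, provided the stored array is long enough.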

Also, because the arrays are large, it is worth normalizing this data into a new collection, although that does increase my data size.
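To illustrate what such a normalization could look like, here is a minimal sketch that flattens one transaction's output array into one record per output, each addressable by its (parent hash, index) pair. The type and member names (`OutputDocument`, `OutputNormalizer`, `Flatten`) are assumptions made for this example and are not taken from the original schema. Repeating the parent hash on every record is presumably where the extra data size comes from.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape of one document in the normalized collection.
public sealed class OutputDocument
{
    public byte[] TxHash; // parent transaction hash, repeated on every output
    public int Index;     // position the output had in the parent's array
    public byte[] Data;   // the output payload itself
}

public static class OutputNormalizer
{
    // Turns one transaction's output array into one record per output.
    public static List<OutputDocument> Flatten(byte[] txHash, IList<byte[]> outputs)
    {
        var result = new List<OutputDocument>(outputs.Count);
        for (var i = 0; i < outputs.Count; i++)
        {
            result.Add(new OutputDocument { TxHash = txHash, Index = i, Data = outputs[i] });
        }
        return result;
    }
}
```

With this shape, each requested pair identifies a whole document, so no $slice projection is needed to avoid loading a large array.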

Licensed under: CC-BY-SA with attribution