Question

I am in the process of changing the underlying database from a relational database to MongoDB, and I need to "recreate" the same semantics through MongoDB queries. All in all, this is going fine, with the exception of one thing: the SQL GREATEST() function:

SELECT * FROM my_table
WHERE (GREATEST(FIELD_A, FIELD_B, FIELD_C, FIELD_D)
BETWEEN some_value AND some_value)
AND FIELD_E = another_value;

I cannot seem to find an equivalent to this GREATEST() function. I am aware that it is possible to achieve somewhat similar functionality with the $cond operator, but as GREATEST() here is finding the greatest of 4 values, this would take a lot of conditionals. Is there any other way of achieving this? I have had a look at both the aggregation framework and mapReduce, but I can't find anything directly similar in the aggregation framework, and I am having a hard time understanding mapReduce.

Is this even possible to achieve? I would assume that the answer is yes, but I cannot really seem to find a reasonable equivalent way of doing it.


Solution 2

MongoDB doesn't currently have an equivalent of the GREATEST function. You could use MapReduce, but it won't give you efficient, immediate results. You also wouldn't effectively be able to return the other fields of the document: you'd need more than one query, or you'd have to duplicate all of the data. And because a MapReduce job in MongoDB must be started manually, the results wouldn't stay up to date as documents are modified unless you re-run it.

The MongoDB aggregation framework wasn't intended for this pattern, and if it is possible at all, it would result in a very lengthy pipeline. It's also currently limited to 16MB of results and doesn't easily return more than the fields you've aggregated. Returning the equivalent of SELECT * requires a manual field projection, potentially more than once depending on the desired output.
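For readers on a newer MongoDB release than the one this answer was written against: as far as I know, $max can be used as an expression over several operands (in $project since 3.2), and $addFields (3.4) keeps the rest of the document. Under those assumptions the pipeline becomes short. This is only a sketch: the field name `greatest` and the helper function are made up for illustration, and a match on a computed value like this cannot use an index on the four source fields.

```javascript
// Build an aggregation pipeline equivalent to
//   GREATEST(a, b, c, d) BETWEEN lower AND upper AND field_e = fieldE
// Assumes a server that supports $addFields and $max as an expression.
function greatestBetweenPipeline(lower, upper, fieldE) {
  return [
    // Filter on the plain field first so an index on field_e can be used.
    { $match: { field_e: fieldE } },
    // "greatest" is a hypothetical name for the computed field.
    {
      $addFields: {
        greatest: { $max: ["$field_a", "$field_b", "$field_c", "$field_d"] },
      },
    },
    // BETWEEN is inclusive on both ends, hence $gte / $lte.
    { $match: { greatest: { $gte: lower, $lte: upper } } },
  ];
}

// In the mongo shell, roughly:
//   db.my_table.aggregate(greatestBetweenPipeline(10, 20, "x"))
```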

Given that you want to return multiple fields, and the result isn't an aggregation, I'd suggest doing something far simpler:

Precompute the result of a call to the greatest function and store it in the document as a new field for easy access in a variety of queries.
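A minimal sketch of that idea in plain JavaScript, assuming all four fields are always present and numeric. The stored field name `greatest_abcd` and both helper functions are hypothetical; the point is only that the maximum is computed once at write time and then queried like any other field.

```javascript
// Hypothetical name for the stored, precomputed field.
const GREATEST_FIELD = "greatest_abcd";
const SOURCE_FIELDS = ["field_a", "field_b", "field_c", "field_d"];

// Return a copy of the document with the precomputed maximum added.
// Call this wherever the application inserts or updates a document,
// otherwise the stored value goes stale.
function withGreatest(doc) {
  const values = SOURCE_FIELDS.map(function (name) { return doc[name]; });
  const copy = Object.assign({}, doc);
  copy[GREATEST_FIELD] = Math.max.apply(null, values);
  return copy;
}

// Build the filter that replaces
//   GREATEST(...) BETWEEN lower AND upper AND FIELD_E = fieldE
// BETWEEN is inclusive on both ends, hence $gte / $lte.
function greatestBetweenFilter(lower, upper, fieldE) {
  const filter = { field_e: fieldE };
  filter[GREATEST_FIELD] = { $gte: lower, $lte: upper };
  return filter;
}

// In the mongo shell, roughly:
//   db.my_table.insert(withGreatest(doc))
//   db.my_table.find(greatestBetweenFilter(10, 20, "x"))
```

The new field can be indexed like any other, which is what makes this cheaper at query time than computing the maximum on every read.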

OTHER TIPS

If the query you quoted is what you are trying to replicate, you can take a different route...

You want to find all documents where the greatest of 4 values falls within a range (plus other criteria).

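You can rephrase this as: documents where all 4 values are at or below the upper limit and at least one is at or above the lower limit. BETWEEN includes both endpoints, so the comparisons are $lte and $gte. Something along the lines of:

db.my_table.find(
    // every value must be at or below the upper limit
    {field_a:{$lte:some_upper_limit}
    ,field_b:{$lte:some_upper_limit}
    ,field_c:{$lte:some_upper_limit}
    ,field_d:{$lte:some_upper_limit}
    // at least one value must be at or above the lower limit
    ,$or:
        [{field_a:{$gte:some_lower_limit}}
        ,{field_b:{$gte:some_lower_limit}}
        ,{field_c:{$gte:some_lower_limit}}
        ,{field_d:{$gte:some_lower_limit}}
        ]
    // the remaining criterion from the original WHERE clause
    ,field_e:another_value
    })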
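The rewrite works because the greatest value is at most the upper limit exactly when every value is, and at least the lower limit exactly when some value is. A quick way to convince yourself is to compare both conditions in plain JavaScript. The function names below are made up for illustration, and the sketch uses inclusive comparisons because SQL's BETWEEN includes both endpoints.

```javascript
// Direct translation of GREATEST(a, b, c, d) BETWEEN lower AND upper.
function greatestInRange(values, lower, upper) {
  const greatest = Math.max.apply(null, values);
  return greatest >= lower && greatest <= upper;
}

// The rephrased condition: every value is at most `upper`,
// and at least one value is at least `lower`.
function allBelowAndOneAbove(values, lower, upper) {
  return values.every(function (v) { return v <= upper; }) &&
         values.some(function (v) { return v >= lower; });
}

// Compare both predicates for every combination of four integers
// in the range 0..maxValue. Returns true if they never disagree.
function predicatesAgree(lower, upper, maxValue) {
  for (let a = 0; a <= maxValue; a++) {
    for (let b = 0; b <= maxValue; b++) {
      for (let c = 0; c <= maxValue; c++) {
        for (let d = 0; d <= maxValue; d++) {
          const values = [a, b, c, d];
          if (greatestInRange(values, lower, upper) !==
              allBelowAndOneAbove(values, lower, upper)) {
            return false;
          }
        }
      }
    }
  }
  return true;
}
```

This only covers documents where all four fields hold numbers. Missing or null fields behave differently in SQL and in MongoDB, so check those cases against your own data.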

It's probably a good idea to look at how indexes might help make this efficient, depending on the data.

Licensed under: CC-BY-SA with attribution