RethinkDB: Getting random documents for each category

https://stackoverflow.com/questions/20980162

25-09-2022
Question

I have a table with ~8000 quiz questions, divided into 25 categories. Each category has an attribute max_questions that tells me how many questions I have to pick at random to generate a quiz.

For example:

Category 1 -> 2 questions
Category 2 -> 3 questions
Category 3 -> 1 question

I came up with a solution, but it takes approx. 2 seconds to run.

// Fetch every category's id and max_questions, ordered by id.
r.table('categories').pluck('id', 'max_questions').orderBy('id').run(conn, function(err, cursor) {
    if(err) return next(new Error(err.msg));

    cursor.toArray(function(err, categories) {
        if(err) return next(new Error(err.msg));

        // One query per category (25 in total); the results are concatenated into one list.
        async.concat(categories, function(category, callback) {
            r.table('questions').filter({category_id: category.id }).sample(category.max_questions).run(conn, callback);
        }, function(err, questions) {
            if(err) return next(new Error(err.msg));
            res.json(questions);
        });
    });
});

Is there a faster way to retrieve the questions with RethinkDB? Making 25 requests and calling .sample() 25 times for one quiz doesn't sound good to me.

I really appreciate your help!

Solution

It'll be much faster if you do it all in one query rather than making multiple requests to the database. Here's a single query that's more or less equivalent to what you wrote:

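// Bind the tables to variables for readability.
var categories = r.table("categories");
var questions = r.table("questions");

// For each category, attach up to max_questions randomly sampled questions.
categories.map(function (doc) {
    return doc.merge({
        questions: questions
            .filter({category_id: doc("id")})
            .sample(doc("max_questions"))
            .coerceTo("ARRAY")
    });
})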

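Note that the tables are bound to variables here: categories is r.table("categories") and questions is r.table("questions"). Each document in the result is a category with a questions array holding up to max_questions randomly sampled questions.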
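Since the single query returns one document per category rather than the flat list of questions the original code produced, here is a rough sketch of how it might be wired into a request handler. The function names buildQuizQuery, flattenQuestions and quizHandler are made up for illustration, and the handler assumes an Express-style (req, res, next) signature like the one in the question. The driver module r and the connection conn are passed in as parameters.

```javascript
// Hypothetical helper: builds the single query from the answer above.
// `r` is the RethinkDB driver module, passed in by the caller.
function buildQuizQuery(r) {
    var categories = r.table("categories");
    var questions = r.table("questions");
    return categories.map(function (doc) {
        return doc.merge({
            questions: questions
                .filter({category_id: doc("id")})
                .sample(doc("max_questions"))
                .coerceTo("ARRAY")
        });
    });
}

// Hypothetical helper: turns [{id, max_questions, questions: [...]}, ...]
// into one flat array of questions, the shape the original code returned.
function flattenQuestions(categoriesWithQuestions) {
    return categoriesWithQuestions.reduce(function (all, category) {
        return all.concat(category.questions || []);
    }, []);
}

// Hypothetical Express-style handler (assumed signature, as in the question).
function quizHandler(r, conn) {
    return function (req, res, next) {
        buildQuizQuery(r).run(conn, function (err, cursor) {
            if (err) return next(err);
            cursor.toArray(function (err, rows) {
                if (err) return next(err);
                res.json(flattenQuestions(rows));
            });
        });
    };
}
```

This keeps everything in one round trip to the database; the flattening happens in the application and only touches the handful of sampled questions.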

Licensed under: CC-BY-SA with attribution
