Handling an Incredibly Large JSON Document in CouchDB

https://stackoverflow.com/questions/19358613

30-06-2022

Question

I'm new to NoSQL databases and I'm having a hard time figuring out how to handle a very large JSON document that could amount to over 20MB on my local drive. This structure will definitely grow over time, and I worry about query speed and about having to search deep through the nested JSON object that is returned just to get a single string out. My JSON is deeply nested, for example:

{
"exams": {
    "exam1": {
        "year": {
            "math": {
                "questions": [
                    {
                        "question_text": "first question",
                        "options": [
                            "option1",
                            "option2",
                            "option3",
                            "option4",
                            "option5"
                        ],
                        "answer": 1,
                        "explaination": "explain the answer"
                    },
                    {
                        "question_text": "second question",
                        "options": [
                            "option1",
                            "option2",
                            "option3",
                            "option4",
                            "option5"
                        ],
                        "answer": 1,
                        "explaination": "explain the answer"
                    },
                    {
                        "question_text": "third question",
                        "options": [
                            "option1",
                            "option2",
                            "option3",
                            "option4",
                            "option5"
                        ],
                        "answer": 1,
                        "explaination": "explain the answer"
                    }
                ]
            },
            "english": {same structure as above}
        },
        "1961": {}
    },
    "exam2": {},
    "exam3": {},
    "exam4": {}
}
}

In the main application, question objects are created and appended based on the type of exam, the year, and the subject, which makes the JSON document huge over time. How can I remodel this to avoid slow queries in the future?

Solution

Dominic is right. You need to start splitting the large document into smaller pieces and storing them as separate documents.

The next question is how to recompose the document after it's been split.

Considering you're using CouchDB, I would recommend doing this at the application layer. A good starting point would be to create one document per exam and store these in their own database. Then have a single document (exams) in another database that holds pointers to the exam documents.
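As a rough illustration of that split, here is a minimal Python sketch. It works on plain dictionaries rather than a live CouchDB server, and the function name `split_exams`, the `years` field, and the `exam_ids` field are assumptions made for this example, not anything prescribed by CouchDB. The only CouchDB convention used is `_id`, the field that identifies a document.

```python
import json


def split_exams(big_doc):
    """Split one nested document into per-exam documents plus an index document."""
    exam_docs = []
    for exam_name, years in big_doc["exams"].items():
        # One document per exam; "_id" is the CouchDB document id field.
        # The "years" field name is a hypothetical choice for this sketch.
        exam_docs.append({"_id": exam_name, "years": years})
    # The index ("exams") document only holds pointers, not the exam content.
    # "exam_ids" is likewise a hypothetical field name.
    index_doc = {"_id": "exams", "exam_ids": [doc["_id"] for doc in exam_docs]}
    return exam_docs, index_doc


# A cut-down version of the structure from the question.
big_doc = {
    "exams": {
        "exam1": {
            "1961": {
                "math": {
                    "questions": [
                        {
                            "question_text": "first question",
                            "options": ["option1", "option2"],
                            "answer": 1,
                        }
                    ]
                }
            }
        },
        "exam2": {},
    }
}

exam_docs, index_doc = split_exams(big_doc)
print(json.dumps(index_doc))
```

Each entry in `exam_docs` would then be saved to the exams database and `index_doc` to the other one. If a single exam still grows too large, the same idea could be applied one level further down, for example one document per exam, year, and subject.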

You can then retrieve the exams document and fetch individual exams one by one as needed. This works especially well with paging, since most people will only want to see the most recent exams.
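The paging idea could look roughly like the sketch below. It is an assumption-laden example: `fetch_page` is a hypothetical helper, the exams database is simulated with a dictionary keyed by document id, and the pointer list is assumed to be stored oldest first, so reversing it yields the most recent exams. Against a real CouchDB server, each dictionary lookup would be replaced by an HTTP GET of the document by its id.

```python
def fetch_page(index_doc, exam_db, page, page_size):
    """Return the exam documents for one page, most recent pointers first."""
    # Assumes pointers were appended oldest-first, so reverse for newest-first.
    ids = list(reversed(index_doc["exam_ids"]))
    start = page * page_size
    # Only the exams on the requested page are fetched, one by one.
    return [exam_db[exam_id] for exam_id in ids[start:start + page_size]]


# Stand-in for the exams database: document id -> document.
exam_db = {f"exam{i}": {"_id": f"exam{i}", "years": {}} for i in range(1, 5)}
# Stand-in for the index document with pointers to the exam documents.
index_doc = {"_id": "exams", "exam_ids": ["exam1", "exam2", "exam3", "exam4"]}

first_page = fetch_page(index_doc, exam_db, page=0, page_size=2)
second_page = fetch_page(index_doc, exam_db, page=1, page_size=2)
print([doc["_id"] for doc in first_page])
```

The application only ever loads the small index document plus the handful of exams it is about to display, instead of the whole 20MB structure.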

Licensed under: CC-BY-SA with attribution

Not affiliated with Stack Overflow