Many-to-many relationship with NoSQL database

Question 1

Firstly, picking between a NoSQL and a SQL database is hard if you're not familiar with the basic principles. If this is the only data you are storing, go with a relational (SQL) database. If there is more data (which I assume) and it requires more of an interwoven schema, stick with NoSQL hands down.

I would take the relational route on this to keep it from getting too complex: start several collections, one for countries, one for regions, and so on. Don't be discouraged from using relational (SQL) style schemas in a NoSQL database; most of the time they are the best solution.

Then, in each of the sub-groups, have a field which names the parent.

For example:

[
    {'name': 'United Kingdom'},
    {'name': 'United States'}
]

[
    {'name': 'England', 'parent': 'United Kingdom'},
    {'name': 'California', 'parent': 'United States'}
]

That way, your data set doesn't get so nested that the returned data is unmanageable, and you can grab the countries and their corresponding regions (and so on) with ease.
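
As a rough illustration of how the two collections fit together, here is a minimal Python sketch that uses plain in-memory lists in place of real collections. The names `countries`, `regions` and `regions_by_country` are made up for this example; with a real database the grouping would be a query on the `parent` field instead.

```python
from collections import defaultdict

# Stand-ins for the two collections (illustrative data, not a real database).
countries = [
    {'name': 'United Kingdom'},
    {'name': 'United States'},
]
regions = [
    {'name': 'England', 'parent': 'United Kingdom'},
    {'name': 'California', 'parent': 'United States'},
]


def regions_by_country(countries, regions):
    """Group region names under the country named in their 'parent' field."""
    grouped = defaultdict(list)
    for region in regions:
        grouped[region['parent']].append(region['name'])
    # Look up every country, so countries without regions still appear.
    return {c['name']: grouped.get(c['name'], []) for c in countries}


print(regions_by_country(countries, regions))
```

Each record stays small and flat, and the relationship lives in a single field that can be indexed.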

Best of luck!

EDIT: Answering OP's questions:

(Firstly, I'd recommend MongoDB - it's a great solution all around.)

1. Because when you start working with MongoDB, you'll realize that it stores documents side by side on disk. If you edit a huge record like that and it outgrows the space allocated for it, it will most likely be moved to the end of the data file, leaving a hole behind and making your data file look like Swiss cheese. Once you get to that point, you'll have to run a repair to compact it again. Also, this way the data is more easily separated in your application, so if you need to do something with part of the data, you won't have to apply it to the entire object. I am assuming that you will have a large dataset, since there are many different locations in the world.
2. Don't worry too much about that kind of thing. You can use IDs for the parent and match the children by ID if you plan on changing names a lot. I just did it this way because I assumed you wouldn't need to change a location database.
3. Rather than an array, I would use a nested document to store multiple parents. That way, it can be more easily queried and indexed. I would use the following method:
```
{
    'name': 'England',
    'parent': {
        '1': 1,
        '568': 1
    }
}
```

So that way you can employ your idea of indexes and find the regions where parent.568 = 1, for example with db.region.find({'parent.568': 1}).
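
To make the lookup concrete, here is a small Python sketch of the same idea on in-memory dicts. The parent IDs 1 and 568 come from the example above; the second region, its parent ID 2 and the helper `children_of` are invented for illustration. The keys are written as strings because MongoDB field names are strings.

```python
# In-memory stand-in for the region collection (illustrative data only).
regions = [
    {'name': 'England', 'parent': {'1': 1, '568': 1}},
    {'name': 'California', 'parent': {'2': 1}},
]


def children_of(parent_id, regions):
    """Return the names of all regions that list parent_id as a parent."""
    key = str(parent_id)  # parent IDs are stored as string keys
    return [r['name'] for r in regions if r['parent'].get(key) == 1]


print(children_of(568, regions))
```

Because each parent ID is its own key, checking membership is a plain key lookup rather than a scan through an array.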

Question 2

Because of a comment you made, I assume that you mean "MongoDB" when you say "NoSQL". There are a lot of other database technologies commonly referred to as NoSQL which are completely different, but this one seems to be the one you mean.

1. is not a good idea, because to get the whole taxonomy chain you will need to do multiple database queries, which should generally be avoided.
2. and 3. A single document which is a huge tree is not a good idea either, because MongoDB has a limit of 16MB per document. When you create huge, monolithic documents, you might hit that limit.
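
To see where the multiple queries come from, here is a hedged Python sketch that assumes each level is stored as its own document with a reference to its parent, and walks the chain one lookup at a time. The `find_one` helper, the `places` data and the 'London' entry are hypothetical stand-ins; each call to `find_one` represents one round trip to the database.

```python
# In-memory stand-in for a collection; each entry names its parent.
places = {
    'United Kingdom': {'name': 'United Kingdom', 'parent': None},
    'England': {'name': 'England', 'parent': 'United Kingdom'},
    'London': {'name': 'London', 'parent': 'England'},
}

query_count = 0


def find_one(name):
    """Pretend database lookup; counts how many queries were issued."""
    global query_count
    query_count += 1
    return places.get(name)


def taxonomy_chain(name):
    """Follow 'parent' references upwards, one lookup per level."""
    chain = []
    while name is not None:
        doc = find_one(name)
        if doc is None:
            break
        chain.append(doc['name'])
        name = doc['parent']
    return chain


chain = taxonomy_chain('London')
print(chain, query_count)
```

The number of lookups grows with the depth of the taxonomy, which is the cost being warned about here.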

I think that MongoDB might not be the best solution for your use-case. Did you consider using a graph database? MongoDB is optimized for self-contained documents which stand on their own. But the focus of graph databases is on datasets where you have a lot of entities which are defined by their relations to other entities. This looks a lot like your use-case.
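
For a feel of the graph view, here is a toy Python sketch that treats each place as a node and each "belongs to" relation as an edge, then collects all ancestors with a breadth-first walk. The edge data and the `ancestors` function are invented for illustration and are not tied to any particular graph database, which would express this as a traversal query instead.

```python
from collections import deque

# node -> list of parents; a node may have several parents (many-to-many).
parents = {
    'London': ['England'],
    'England': ['United Kingdom', 'Great Britain'],
    'Great Britain': ['United Kingdom'],
    'United Kingdom': [],
}


def ancestors(node, parents):
    """Collect every ancestor of node, breadth first, without duplicates."""
    seen = []
    queue = deque(parents.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already reached through another path
        seen.append(current)
        queue.extend(parents.get(current, []))
    return seen


print(ancestors('London', parents))
```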