Question

I'm designing an API and wondering which query is more efficient:

the embedded 1-n association:

profile = Profile.where(
  'authenticated_devices.device_token' => device_token,
  'authenticated_devices.access_token' => access_token
).first

or the referenced 1-n association:

device = AuthenticatedDevice.where(device_token: device_token, access_token: access_token).first
profile = device.profile
profile.authenticated_device = device
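
For context, here is a minimal sketch of the embedded layout the first query implies, assuming Mongoid; the referenced variant swaps the association macros, as noted in the comments:

class Profile
  include Mongoid::Document
  # Embedded 1-n: devices live inside the profile document.
  embeds_many :authenticated_devices
  # The referenced 1-n variant would instead use:
  #   has_many :authenticated_devices
end

class AuthenticatedDevice
  include Mongoid::Document
  embedded_in :profile
  # The referenced 1-n variant would instead use:
  #   belongs_to :profile
  field :device_token, type: String
  field :access_token, type: String
end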

I've run explains: the referenced query uses a BtreeCursor, while the embedded one uses a BasicCursor. Could the addition of indexes make the embedded 1-n query faster than the referenced 1-n? Also, what are the pitfalls of this query? If I want absolute speed for my API, is it better to use the embedded 1-n or the referenced 1-n? Let's imagine this API is also under heavy load.
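
For reference, that explain output can be reproduced straight from a criteria; a sketch, assuming Mongoid's Criteria#explain:

# Without an index on the embedded token fields this reports a
# BasicCursor (full collection scan); with one, a BtreeCursor.
puts Profile.where(
  'authenticated_devices.device_token' => device_token,
  'authenticated_devices.access_token' => access_token
).explain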


Update:

I received this question: "The real decision between referencing and embedding depends on the amount of 'related' data and how you intend to access it."

Answer: This is a very simple API. It loads the current_user's Profile, which holds all of their info; with every single API call I'll pass a device id and an access token. That is the info embedded in the Profile model, what I call the Authenticated Device. It will be under heavy load. I'm trying to determine whether I should stick with the embedded 1-n and add indexes, or move over to the referenced 1-n. Speed is my highest concern. Does this help?


Solution

The bottom line here is that for anything other than an embedded document you must make more than one query to the database. So in the referenced scenario you find the "child" in one request and then access the parent in another. The code may hide this a bit, but it is actually two round trips to the database.
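
Spelled out with comments, the referenced version from the question makes those two round trips explicit:

# Round trip 1: find the child document by its tokens.
device = AuthenticatedDevice.where(
  device_token: device_token,
  access_token: access_token
).first

# Round trip 2: device.profile lazily issues a second query,
# roughly Profile.find(device.profile_id).
profile = device.profile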

So the embedded model will be faster. What might be confusing you at the moment is the lack of an index on your authenticated_devices.device_token field within your Profile model and collection. With the index in place, those look-ups are optimal.
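
As a sketch, assuming Mongoid, a compound index covering both token fields matches the look-up exactly:

class Profile
  include Mongoid::Document
  embeds_many :authenticated_devices

  # Compound index on the embedded fields the API queries by.
  index({ 'authenticated_devices.device_token' => 1,
          'authenticated_devices.access_token' => 1 })
end

The index still has to be built on the collection, e.g. with rake db:mongoid:create_indexes.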

It is true that another consideration here is the cost of pulling the document that contains all of the "devices" in the embedded collection, but as long as that information is reasonably light it should still incur less overhead than an additional trip to the database.
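
If the Profile document does carry more than the API needs per call, a field projection keeps the pull light; a sketch, where the projected field names are hypothetical:

# Ask the server to return only what the response needs, so any
# heavy fields elsewhere in the document are never shipped over.
profile = Profile.where(
  'authenticated_devices.device_token' => device_token,
  'authenticated_devices.access_token' => access_token
).only(:name, :email, :authenticated_devices).first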

As a final point, if the information you are accessing from Profile is actually very light, then even though it might go against your sensibilities, the fastest possible approach would very likely be to simply replicate that information "per device" and serve it in a single request, rather than referencing another document with another request.
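
A sketch of that denormalized, single-request shape; the profile_* fields are hypothetical copies of whatever the API actually returns:

class AuthenticatedDevice
  include Mongoid::Document
  field :device_token, type: String
  field :access_token, type: String
  # Replicated per device so one query answers the whole call.
  field :profile_name,  type: String
  field :profile_email, type: String

  index({ device_token: 1, access_token: 1 })
end

# One round trip serves the request.
device = AuthenticatedDevice.where(
  device_token: device_token,
  access_token: access_token
).first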

So look at your usage patterns and consider the size of the data, but generally, as long as you have indexes in place that match your usage patterns, there is nothing wrong with the embedded model.
