Efficient way to combine results of two database queries
Question
I have two tables on different servers, and I'd like some help finding an efficient way to combine and match the datasets. Here's an example:
From server 1, which holds our stories, I perform a query like:
query = """SELECT author_id, title, text
FROM stories
ORDER BY timestamp_created DESC
LIMIT 10
"""
results = DB.getAll(query)
for i in range(len(results)):
#Build a string of author_ids, e.g. '1314,4134,2624,2342'
But, I'd like to fetch some info about each author_id from server 2:
query = """SELECT id, avatar_url
FROM members
WHERE id IN (%s)
"""
values = (uid_list)
results = DB.getAll(query, values)
Now I need some way to combine these two queries so I have a dict that has the story as well as avatar_url and member_id.
If this data were on one server, it would be a simple join that would look like:
SELECT *
FROM members, stories
WHERE members.id = stories.author_id
But since we store the data on multiple servers, this is not possible.
What is the most efficient way to do this? I understand the merging probably has to happen in my application code ... any efficient sample code that minimizes the number of dict loops would be greatly appreciated!
Thanks.
Solution
If memory isn't a problem, you could use a dictionary.
results1_dict = dict((row[0], list(row[1:])) for row in results1)
results2_dict = dict((row[0], list(row[1:])) for row in results2)
for key, value in results2_dict:
if key in results1_dict:
results1_dict[key].extend(value)
else:
results1_dict[key] = value
This isn't particularly efficient (n2), but it is relatively simple and you can tweak it to do precisely what you need.
OTHER TIPS
The only option looks to be Database Link, but is unfortunately unavailable in MySQL.
You'll have to do the merging in your application code. Better to keep the data in same database.
You will have to bring the data together somehow.
There are things like server links (though that is probably not the correct term in mysql context) that might allow querying accross different DBs. This opens up another set of problems (security!)
The easier solution is to bring the data together in one DB.
The last (least desirable) solution is to join in code as Padmarag suggests.
Is it possible to setup replication of the needed tables from one server to a database on the other? That way you could have all your data on one server.
Also, see FEDERATED storage engine, available since mysql 5.0.3.