Question
I am creating a database with two main tables: items and locations.
The items table contains approx. 3 million records and is growing at a rate of 1 million records a month.
The locations table contains 50,000 locations (name, latitude, longitude) and will not change in size.
Every read of the items table will require a JOIN to the locations table to find out where the item is located, unless I duplicate the location content for every item record. I anticipate around 5 million queries to the items table every month.
Searching of the database will be performed by Sphinx, so I do not need to worry about complicated MySQL geodistance queries.
My question is: would I be better off duplicating the locations data for every item, or performing JOINs?
Thanks in advance
Solution
I think it would be better to use a JOIN between ITEM and LOCATIONS, with a foreign key in ITEM's table.
There will be too much redundant data if you duplicate the location data for every item.
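As a rough illustration of that layout, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL, so it can be run anywhere. The table layout follows the question (name, latitude, longitude), but the title column, the location_id column name, the sample rows and the item_with_location helper are all assumptions made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE locations (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        latitude  REAL NOT NULL,
        longitude REAL NOT NULL
    );
    CREATE TABLE items (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        -- Each item stores only the key of its location, not a copy of it.
        location_id INTEGER NOT NULL REFERENCES locations(id)
    );
""")

# Hypothetical sample data: one location shared by two items.
conn.execute("INSERT INTO locations VALUES (1, 'Example Town', 51.5, -0.12)")
conn.executemany(
    "INSERT INTO items (title, location_id) VALUES (?, ?)",
    [("first item", 1), ("second item", 1)],
)


def item_with_location(item_id):
    """Return (title, name, latitude, longitude) for one item via a JOIN."""
    return conn.execute(
        """
        SELECT items.title, locations.name,
               locations.latitude, locations.longitude
        FROM items
        JOIN locations ON locations.id = items.location_id
        WHERE items.id = ?
        """,
        (item_id,),
    ).fetchone()
```

Both items point at the same locations row, so the name and coordinates are stored once however many items share them.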
OTHER TIPS
We can discuss denormalisation from an academic point of view, but practice always differs from theory. How you design your structure should also depend on how it is used - for you, I guess the priority is response time.
Joining to a 50k-row table is not very costly and will not take much time, as long as the locations table is not growing.
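One way to check this for yourself is to look at the query plan. The sketch below again uses Python's sqlite3 module rather than MySQL, with an assumed schema (the title and location_id columns are hypothetical), and asks SQLite how it would run the join; MySQL's EXPLAIN statement plays the same role there, though its output looks different.

```python
import sqlite3

# Assumed schema, mirroring the question; SQLite is a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (
        id INTEGER PRIMARY KEY, name TEXT, latitude REAL, longitude REAL
    );
    CREATE TABLE items (
        id INTEGER PRIMARY KEY, title TEXT,
        location_id INTEGER REFERENCES locations(id)
    );
""")

# Ask the planner how it would execute a single-item lookup with the join.
plan = conn.execute(
    """
    EXPLAIN QUERY PLAN
    SELECT items.title, locations.name
    FROM items
    JOIN locations ON locations.id = items.location_id
    WHERE items.id = ?
    """,
    (1,),
).fetchall()

# The human-readable step description is the last column of each plan row.
details = [row[-1] for row in plan]
for step in details:
    print(step)
```

If each step reports a search by primary key rather than a scan, the join costs one keyed lookup into locations per item read, which is why the size of a fixed 50k-row table matters little.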
If you have plenty of free space, denormalisation will improve your queries, but it needlessly duplicates the data of 50,000 location records across every item. If you keep the tables normalised, on the other hand, you lose some of the speed you are looking for.