Question

I am creating a database with two main tables: items, locations.

The items table contains approximately 3 million records and is growing at a rate of 1 million records a month.

The locations table contains 50,000 locations (name, latitude, longitude) and will not change in size.

Every read of the items table will require a JOIN to the locations table to find out where the item is located, unless I duplicate the location content in every item record. I anticipate around 5 million queries against the items table every month.
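
For context, a typical read with the JOIN would look roughly like the sketch below (table names aside, the column names such as `location_id`, `id` and `title` are placeholders rather than my actual schema):

```sql
-- Illustrative read: fetch one item together with its location.
-- items.location_id and the other column names are placeholders.
SELECT i.id,
       i.title,
       l.name AS location_name,
       l.latitude,
       l.longitude
FROM items AS i
JOIN locations AS l ON l.id = i.location_id
WHERE i.id = 12345;
```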

Searching of the database will be performed by Sphinx, so I do not need to worry about complicated MySQL geodistance queries.

My question is: would I be better off duplicating the locations data for every item, or performing JOIN statements?

Thanks in advance


Solution

I think it would be better to use a JOIN between items and locations, with a foreign key in the items table.
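
As a minimal sketch of that layout (assuming InnoDB and illustrative column names, since the real schema isn't shown):

```sql
-- Normalised layout: items stores only a reference to locations.
CREATE TABLE locations (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name      VARCHAR(255) NOT NULL,
    latitude  DECIMAL(9,6) NOT NULL,
    longitude DECIMAL(9,6) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE items (
    id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title       VARCHAR(255) NOT NULL,
    location_id INT UNSIGNED NOT NULL,
    CONSTRAINT fk_items_location
        FOREIGN KEY (location_id) REFERENCES locations (id)
) ENGINE=InnoDB;
```

The foreign key gives you an index on items.location_id, so the lookup into the 50,000-row locations table stays cheap.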

There would be too much data redundancy if you duplicated the location data for every item.

Other tips

We can discuss denormalisation from the academic point of view, but practice always differs from theory. How you design your structure should also depend on how it will be used; in your case, I guess the priority is response time.

Joining to a 50k-row table is not very costly and will not take much time, as long as the locations table is not growing.

If you have plenty of free space, denormalisation will generally improve your query speed, but you will be needlessly duplicating the 50,000 location records across millions of items; keep the JOIN, on the other hand, and you give up a little of the speed you are looking for.
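
For comparison, the denormalised alternative would repeat the location columns on every item row, roughly like this (column names are again illustrative):

```sql
-- Denormalised layout: every item row carries its own copy of the location.
CREATE TABLE items (
    id                 BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title              VARCHAR(255) NOT NULL,
    location_name      VARCHAR(255) NOT NULL,
    location_latitude  DECIMAL(9,6) NOT NULL,
    location_longitude DECIMAL(9,6) NOT NULL
) ENGINE=InnoDB;
```

Reads skip the JOIN, but the same 50,000 locations end up copied across millions of item rows, and correcting a location means updating every item that uses it.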

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow