What's the best way to deal with big tables like 'thread_views'?
09-10-2019
Question
I'm trying to decide whether to add some statistics to my sites, for example 'most viewed thread this day/week/year'.
I would need a table that stores every view, tied to a user (so that the same user cannot add many views) and to a thread, along with a timestamp.
But that would be one big table (in terms of rows). Is this the way to go?
No correct solution
OTHER TIPS
The answer depends on a number of things, such as the number of threads, the number of views, what hardware you have, the typical load, the read/write ratio, the required accuracy, etc.
A reasonable answer to your question, without knowing the details of your specific scenario, is to create the table you are describing:
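-- INTEGER and TIMESTAMP are assumed types; adjust to your schema.
CREATE TABLE thread_views(
    thread_id INTEGER   NOT NULL REFERENCES thread(thread_id)
   ,user_id   INTEGER   NOT NULL REFERENCES user(user_id)
   ,timestamp TIMESTAMP NOT NULL
   ,PRIMARY KEY(thread_id, user_id)
);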
The above approach gives you flexibility and good enough performance for the typical scenario.
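To make this concrete, here is a minimal sketch of how such a table might be written to and queried, using Python's built-in sqlite3 module. The function names (open_db, record_view, most_viewed) are hypothetical, the foreign keys are left out so the example is self-contained, and storing timestamps as ISO-8601 text is an assumption. Note that with the primary key on (thread_id, user_id), each user counts once per thread and the timestamp is that of the first view, so a "this week" query counts users whose first view fell in that week.

```python
import sqlite3


def open_db():
    # In-memory database for illustration; foreign keys omitted for brevity.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE thread_views (
               thread_id INTEGER NOT NULL,
               user_id   INTEGER NOT NULL,
               timestamp TEXT    NOT NULL,  -- ISO-8601 text (assumption)
               PRIMARY KEY (thread_id, user_id)
           )"""
    )
    return conn


def record_view(conn, thread_id, user_id, timestamp):
    # The primary key rejects a second view by the same user on the same
    # thread; INSERT OR IGNORE (SQLite syntax) turns that into a no-op.
    cur = conn.execute(
        "INSERT OR IGNORE INTO thread_views (thread_id, user_id, timestamp) "
        "VALUES (?, ?, ?)",
        (thread_id, user_id, timestamp),
    )
    return cur.rowcount == 1  # True only if a new row was stored


def most_viewed(conn, since, limit=10):
    # Count views per thread from a cutoff onward, most viewed first.
    return conn.execute(
        """SELECT thread_id, COUNT(*) AS views
           FROM thread_views
           WHERE timestamp >= ?
           GROUP BY thread_id
           ORDER BY views DESC, thread_id
           LIMIT ?""",
        (since, limit),
    ).fetchall()
```

Passing the start of the current day, week, or year as the cutoff gives the three rankings from the question. Other databases have their own equivalents of INSERT OR IGNORE, so treat that statement as SQLite-specific.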
I recently answered two similar questions that you can look at for ideas:
'Count article comments' and 'Count visitor hits per day'.
A final point is that many of the major databases include tools for aggregating data. Those tools let you keep a normalized data model while still gaining most of the benefits of a precomputed table of statistics.
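As a rough illustration of the precomputed-statistics idea, the sketch below rebuilds a per-day summary table from thread_views by hand. It is a stand-in for the built-in aggregation tools mentioned above, not an example of any particular one. SQLite is used only to keep it self-contained, and the table name thread_view_stats and the functions create_schema and refresh_daily_stats are hypothetical.

```python
import sqlite3


def create_schema():
    conn = sqlite3.connect(":memory:")
    # Normalized source table, as described in the answer.
    conn.execute(
        """CREATE TABLE thread_views (
               thread_id INTEGER NOT NULL,
               user_id   INTEGER NOT NULL,
               timestamp TEXT    NOT NULL,  -- ISO-8601 text (assumption)
               PRIMARY KEY (thread_id, user_id)
           )"""
    )
    # Hypothetical precomputed table: one row per thread per day.
    conn.execute(
        """CREATE TABLE thread_view_stats (
               thread_id INTEGER NOT NULL,
               day       TEXT    NOT NULL,
               views     INTEGER NOT NULL,
               PRIMARY KEY (thread_id, day)
           )"""
    )
    return conn


def refresh_daily_stats(conn):
    # Rebuild the summary in a single transaction so readers never see
    # a half-filled table; "with conn" commits on success, rolls back on error.
    with conn:
        conn.execute("DELETE FROM thread_view_stats")
        conn.execute(
            """INSERT INTO thread_view_stats (thread_id, day, views)
               SELECT thread_id, date(timestamp), COUNT(*)
               FROM thread_views
               GROUP BY thread_id, date(timestamp)"""
        )
```

Statistics pages would then read from thread_view_stats, while thread_views stays the single source of truth. A full rebuild is the simplest possible refresh; on a large table an incremental refresh or the database's own facility would likely be preferable.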