I'm building a classifieds website where I want to store a count of the number of views of each advert, which I want to be able to display in a graph at a later date, by day and month etc., for each user and each of their adverts. I'm just struggling to decide how best to design the MySQL database to store a potentially large amount of data for each advert.

I am going to create a table for the page views as follows, which would store one record per view per advert; for example, if advert id 1 has 200 views, the table will store 200 records:

  • advert_id (unique id of the advert)
  • date_time (date and time of the view)
  • ip_address (IP address of the person viewing the advert)
  • page_referrer (URL of the referrer page)

As mentioned, I am going to build functionality for each member of the site to view a graph of the view statistics for each of their adverts, so they can see the total views each advert has had, how many views it has had each day (between two given dates), and how many views per month. I'll do this by grouping on the date_time field, as in the sketch below.
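In MySQL terms, I am thinking of something like this (the table name advert_views and the index are just working assumptions):

    CREATE TABLE advert_views (
        advert_id     INT UNSIGNED NOT NULL,
        date_time     DATETIME     NOT NULL,
        ip_address    VARCHAR(45)  NOT NULL,
        page_referrer VARCHAR(255),
        INDEX idx_advert_date (advert_id, date_time)  -- keeps per-advert date queries fast
    );

    -- views per day for one advert between two dates
    SELECT DATE(date_time) AS view_date, COUNT(*) AS views
    FROM advert_views
    WHERE advert_id = 1
      AND date_time >= '2023-01-01' AND date_time < '2023-02-01'
    GROUP BY view_date;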

If my site grows quite large and, for example, has 40,000 adverts with an average of 3,000 page views each, the table would hold 120 million records. Is this too large? And would the MySQL queries to produce the graphs be very slow?

Do you think the table and method above are the best way to store these advert view statistics, or is there a better way to do this?


Solution

Unless you really need to store all that data, it would probably be better to just increment a count when the advert is viewed. That way you have just one row per advert (or even just a column in the advert's own row).
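For example (a minimal sketch; the views column on the adverts table is illustrative, not something you already have):

    -- one counter column on the advert's own row
    ALTER TABLE adverts ADD COLUMN views INT UNSIGNED NOT NULL DEFAULT 0;

    -- run on every page view
    UPDATE adverts SET views = views + 1 WHERE id = 1;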

Another option is to write the views to a log file and process it offline, but it's generally better to process data as you get it and incorporate the results into your application's normal flow.

If you really need to save all of that data, then rotating the log table (weekly, say, after processing it) would reduce the overhead of storing all of that information indefinitely.
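One way to rotate in MySQL, sketched under the assumption that the raw log table is called advert_views; RENAME TABLE swaps both names in one atomic step, so no views are lost in between:

    -- prepare an empty table with the same structure
    CREATE TABLE advert_views_new LIKE advert_views;

    -- atomic swap: new writes go to the fresh table,
    -- last week's data is now in advert_views_old
    RENAME TABLE advert_views TO advert_views_old,
                 advert_views_new TO advert_views;

    -- process/aggregate advert_views_old, then drop it
    DROP TABLE advert_views_old;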

Other tips

I was working on a website with 50,000 unique visitors per day, and I had the same table as you.

The table was growing by roughly 200-500 MB per day, but I was able to clean it out every day.

The best option is to make a second table: count the visitors every day, add the result to the second table, and flush the first table.

first table example:

  • advert_id
  • date & time
  • ip address
  • page referrer

second table example (for graph):

  • advert_id
  • date
  • visitors
  • unique visitors

Example SQL query to count visitors and unique visitors per advert per day:

SELECT
    advert_id,
    COUNT(*) AS visitors,
    COUNT(DISTINCT ip_address) AS unique_visitors,
    DATE(date_time) AS view_date
FROM
    advert_views
GROUP BY
    advert_id,
    view_date
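To fill the second table and flush the first, a sketch (the table names advert_views and advert_views_daily are just examples):

    INSERT INTO advert_views_daily (advert_id, date, visitors, unique_visitors)
    SELECT
        advert_id,
        DATE(date_time),
        COUNT(*),
        COUNT(DISTINCT ip_address)
    FROM advert_views
    GROUP BY advert_id, DATE(date_time);

    -- flush the raw log once the daily totals are saved
    TRUNCATE TABLE advert_views;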

The problem is not really performance (MySQL's MyISAM engine is quite smart and fast); the problem is storing such a large amount of data.


90% of statistics tools (even Google Analytics or Webalizer) generate their graphs only once per day, not in real time.

It's also quite a good idea to store the IP as an INT, using PHP's ip2long() function (or MySQL's INET_ATON()).
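A small sketch of the integer-IP approach on the MySQL side (INT UNSIGNED fits IPv4 only; IPv6 would need a BINARY(16) column and INET6_ATON() instead):

    -- IPv4 fits in an unsigned 32-bit integer
    ALTER TABLE advert_views MODIFY ip_address INT UNSIGNED NOT NULL;

    -- convert on the way in...
    INSERT INTO advert_views (advert_id, date_time, ip_address)
    VALUES (1, NOW(), INET_ATON('192.0.2.1'));

    -- ...and back on the way out
    SELECT INET_NTOA(ip_address) FROM advert_views;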
