Question

I'm working on a URL shortener project with PHP and MySQL that tracks the visits to each URL. I have a visits table which mainly consists of these columns:

time_in_second | country | referrer |  os   | browser | device | url_id
#####################################################################
1348128639     |    US   |   direct |  win  | chrome  | mobile | 3404  
1348128654     |    US   |   google | linux | chrome  | desktop| 3404  
1348124567     |    UK   |   twitter| mac   | mozila  | desktop| 3404  
1348127653     |    IND  |   direct | win   | IE      | desktop| 3465  

Now I want to query this table. For example, I want to get the visit data for the URL with url_id = 3404. Because I need to provide statistics and draw graphs for this URL, I need the following data:

  • Number of visits per OS for this URL, for example 20 Windows, 15 Linux, ...
  • Number of visits in each period of a chosen length, for example every 10 minutes over the past 24 hours
  • Number of visits per country
  • ...

As you can see, some columns, such as country, can take many different values.

One idea I can think of is a query that outputs the count of each unique value in each column. For the country column in the data above, that would mean one output column for num_US, one for num_UK, and one for num_IND.

The question is: how can I implement such a query in SQL (MySQL) so that it performs well?

If you think this kind of query would not perform well, what would you suggest instead?

Any help would be greatly appreciated.

UPDATE: see the question "SQL; Only count the values specified in each column". I think it is similar to mine, but the difference is the number of possible values per column (country, for instance, can take many values), which makes the query more complex.


Solution

It looks like you need more than one query. You could probably write a single query with different parameters, but it would be complex and hard to maintain. I would approach it as several small queries: write one query per requirement and call each of them separately. For example, for the per-country counts you mentioned, you could do the following:

SELECT country, COUNT(*) AS visits FROM <TABLE_NAME> WHERE url_id = 3404 GROUP BY country;

By the way, I have not tested this query, so it may be inaccurate, but this is just to give you an idea. I hope this helps.
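
The "one small query per requirement" approach can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name `visits`, the helper `counts_by`, and the `DIMENSIONS` whitelist are assumptions that do not come from the question. The sample rows are the ones from the question's table.

```python
import sqlite3

# Sample rows copied from the question; the table name `visits` is assumed.
ROWS = [
    (1348128639, "US", "direct", "win", "chrome", "mobile", 3404),
    (1348128654, "US", "google", "linux", "chrome", "desktop", 3404),
    (1348124567, "UK", "twitter", "mac", "mozila", "desktop", 3404),
    (1348127653, "IND", "direct", "win", "IE", "desktop", 3465),
]

# Whitelist of groupable columns, so a column name never comes from user input.
DIMENSIONS = ("country", "referrer", "os", "browser", "device")


def make_db():
    """Create an in-memory database holding the sample visits."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE visits (time_in_second INTEGER, country TEXT, referrer TEXT,"
        " os TEXT, browser TEXT, device TEXT, url_id INTEGER)"
    )
    conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def counts_by(conn, dimension, url_id):
    """Run one GROUP BY query and return {value: number_of_visits}."""
    if dimension not in DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dimension}")
    sql = (
        f"SELECT {dimension}, COUNT(*) FROM visits "
        f"WHERE url_id = ? GROUP BY {dimension}"
    )
    return dict(conn.execute(sql, (url_id,)).fetchall())


conn = make_db()
country_counts = counts_by(conn, "country", 3404)  # one query per graph
os_counts = counts_by(conn, "os", 3404)
```

Each graph gets its own call, so a column with many distinct values (such as country) simply yields more rows rather than more output columns.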

Another suggestion is to look into Google Analytics. It already provides a lot of what you are implementing, so it may help as well.

Cheers.

Other tips

Each of these graphs you want to draw represents a separate relation, so my off-the-cuff response is that you can't build a single query that gives you exactly the data you need for every graph you want to draw.

From this point, your choices are:

  1. Use different queries for different graphs
  2. Send a bunch of data to the client and let it do the required post-processing to create the exact sets of data it needs for different graphs
  3. Farm it all out to Google Analytics (a la @wahab-mirjan)

If you go with option 2, you can minimize the amount of data you send by counting hits per (10-minute, os, browser, device, url_id) tuple. This essentially collapses all duplicate rows into a single row with a count. The client software would take these numbers and further reduce them by country (or whatever) to get the numbers it needs for a graph. To be honest though, I think you're buying yourself extra complexity for not very much gain.
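
As a rough sketch of what the client-side step in option 2 might look like, the snippet below takes pre-aggregated rows and reduces them further along one field. The row layout, the `FIELDS` tuple, and the `reduce_by` helper are hypothetical. Note that country is included in the grouping tuple here as an assumption, since the client cannot reduce by a field the server has already aggregated away. All rows are assumed to belong to a single url_id, and the hit counts are made up for illustration.

```python
from collections import Counter

# Hypothetical pre-aggregated rows as a server might send them for one url_id:
# (bucket_start, country, os, browser, device, hits)
FIELDS = ("bucket_start", "country", "os", "browser", "device")
AGGREGATED = [
    (1348128600, "US", "win", "chrome", "mobile", 12),
    (1348128600, "US", "linux", "chrome", "desktop", 5),
    (1348128600, "UK", "mac", "mozila", "desktop", 3),
    (1348129200, "US", "win", "chrome", "mobile", 7),
]


def reduce_by(rows, field):
    """Sum the hit counts of all rows that share the same value of `field`."""
    idx = FIELDS.index(field)
    totals = Counter()
    for row in rows:
        totals[row[idx]] += row[-1]  # the last element is the hit count
    return dict(totals)


hits_by_country = reduce_by(AGGREGATED, "country")
hits_by_bucket = reduce_by(AGGREGATED, "bucket_start")
```

The same payload feeds every graph, at the cost of shipping more rows than any single graph needs.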

If you insist on doing this yourself (instead of using a service), then go with a different query for each kind of graph. Start with a couple of reasonable indexes (url_id and time_in_second are obvious starting points). Use the EXPLAIN statement (or whatever your database provides) to understand how each query is executed.
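
A minimal sketch of that advice, again using Python's sqlite3 module as a stand-in for MySQL: it creates a composite index on (url_id, time_in_second), counts visits per 10-minute bucket, and inspects the query plan. The table name `visits`, the index name `idx_visits_url_time`, and the reduced column set are assumptions. SQLite's EXPLAIN QUERY PLAN is used here; MySQL's EXPLAIN produces different output, so treat this only as an illustration of the workflow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (time_in_second INTEGER, country TEXT, url_id INTEGER)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (1348128639, "US", 3404),
        (1348128654, "US", 3404),
        (1348124567, "UK", 3404),
        (1348127653, "IND", 3465),
    ],
)

# Composite index: the equality column (url_id) first, then the range column.
conn.execute("CREATE INDEX idx_visits_url_time ON visits (url_id, time_in_second)")

# Round each timestamp down to the start of its 600-second (10-minute) bucket.
BUCKET_SQL = (
    "SELECT time_in_second - (time_in_second % 600) AS bucket_start, COUNT(*) "
    "FROM visits WHERE url_id = ? AND time_in_second >= ? "
    "GROUP BY bucket_start ORDER BY bucket_start"
)
# Look back 24 hours from the newest sample row (a stand-in for "now").
params = (3404, 1348128639 - 24 * 3600)

buckets = conn.execute(BUCKET_SQL, params).fetchall()

# Ask the database how it would execute the query; the 4th column is the detail text.
plan = conn.execute("EXPLAIN QUERY PLAN " + BUCKET_SQL, params).fetchall()
plan_details = [row[3] for row in plan]
```

Buckets with no visits do not appear in the result, so the graphing code would need to fill those gaps with zeros.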

Sorry, I am new to Stack Overflow and had a problem with comment formatting. Here is my answer again; hopefully it works now:

I'm not sure why it would perform poorly. The way I see it, you will end up with a result that looks like this:

country | count 
################# 
     US | 304 
     UK | 123 
    IND | 23 

So when you group by country and count, it is a single query. I think this will get you going in the right direction. In any case, it is just an opinion, so if you find another approach, I would be interested to hear about it as well.

Apologies for the comment mess-up above.

Cheers

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow