Question

I have the following code:

function cron_day_counts()
{
    $subids = get_subids();
    array_push($subids, '');
    $from = '2011-10-19';
    $to = '2011-10-20';
    $days = days_interval($from, $to);
    $result_array = array();
    foreach ($subids as $subid)
    {
        for ($i = 0; $i < $days; $i++)
        {
            $date = date('Y-m-d', strtotime($from . '+ ' . $i . ' day'));
            $date_prev = date('Y-m-d', strtotime($date . '- 1 day'));

            $unique_id_query = mysql_query('SELECT (SELECT COUNT(DISTINCT `id`,`subid`) FROM `tb_stats` WHERE `date` <= \'' . $date . '\'' . (!empty($subid) && is_numeric($subid) ? ' AND `subid` = \'' . mysql_real_escape_string($subid) . '\'' : '') . ') - (SELECT COUNT(DISTINCT `id`,`subid`) FROM `tb_stats` WHERE `date` <= \'' . mysql_real_escape_string($date_prev) . '\'' . (!empty($subid) && is_numeric($subid) ? ' AND `subid` = \'' . mysql_real_escape_string($subid) . '\'' : '') . ') AS `unique_ids`');
            $unique_id_result = mysql_fetch_assoc($unique_id_query);

            $total_id_query = mysql_query('SELECT COUNT(DISTINCT `id`,`subid`) AS `total_ids` FROM `tb_stats` WHERE `date` = \'' . mysql_real_escape_string($date) . '\'' . (!empty($subid) && is_numeric($subid) ? ' AND `subid` = \'' . mysql_real_escape_string($subid) . '\'' : ''));
            $total_id_result = mysql_fetch_assoc($total_id_query);

            $unique_ip_query = mysql_query('SELECT (SELECT COUNT(DISTINCT `ip`,`subid`) FROM `tb_stats` WHERE `date` <= \'' . $date . '\'' . (!empty($subid) && is_numeric($subid) ? ' AND `subid` = \'' . mysql_real_escape_string($subid) . '\'' : '') . ') - (SELECT COUNT(DISTINCT `ip`,`subid`) FROM `tb_stats` WHERE `date` <= \'' . mysql_real_escape_string($date_prev) . '\'' . (!empty($subid) && is_numeric($subid) ? ' AND `subid` = \'' . mysql_real_escape_string($subid) . '\'' : '') . ') AS `unique_ips`');
            $unique_ip_result = mysql_fetch_assoc($unique_ip_query);

            $total_ip_query = mysql_query('SELECT COUNT(DISTINCT `ip`,`subid`) AS `total_ips` FROM `tb_stats` WHERE `date` = \'' . mysql_real_escape_string($date) . '\'' . (!empty($subid) && is_numeric($subid) ? ' AND `subid` = \'' . mysql_real_escape_string($subid) . '\'' : ''));
            $total_ip_result = mysql_fetch_assoc($total_ip_query);

            $global_query = mysql_query('SELECT COUNT(`id`) AS `global` FROM `tb_stats` WHERE `date` = \'' . mysql_real_escape_string($date) . '\'' . (!empty($subid) && is_numeric($subid) ? ' AND `subid` = \'' . mysql_real_escape_string($subid) . '\'' : ''));
            $global_result = mysql_fetch_assoc($global_query);

            $result = array();
            $result['subid'] = $subid;
            $result['date'] = $date;
            $result['unique_ids'] = $unique_id_result['unique_ids'];
            $result['total_ids'] = $total_id_result['total_ids'];
            $result['unique_ips'] = $unique_ip_result['unique_ips'];
            $result['total_ips'] = $total_ip_result['total_ips'];
            $result['global'] = $global_result['global'];

            $result_array[] = $result;
        }

    }
    //db insert
    return $result_array;
}

I want to move all the queries out of the foreach and for loops; I believe it would run faster. I'm stuck on this and have no idea how to do it. Any help would be appreciated.

Solution

I would say at the very least you should combine the queries in the loop to just one for each day. So for a 5 day range you would have 5 queries.
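
As a rough sketch of the combining idea: the three same-day counts (`total_ids`, `total_ips`, `global`) share one WHERE clause, so they can come back as a single row per day (and per subid, if the subid filter is kept). The helper name `build_day_totals_query` and the `?` placeholders for a prepared statement are assumptions, not part of the original code. The two `unique_*` counts compare against the previous day, so they would still need their own query or a further rewrite.

```php
<?php
// Hypothetical helper: builds ONE query returning the three same-day counts
// that the original code fetched with three separate queries.
// Returns array($sql, $params) for use with a prepared statement.
function build_day_totals_query($date, $subid)
{
    $sql = 'SELECT COUNT(DISTINCT `id`,`subid`) AS `total_ids`, '
         . 'COUNT(DISTINCT `ip`,`subid`) AS `total_ips`, '
         . 'COUNT(`id`) AS `global` '
         . 'FROM `tb_stats` WHERE `date` = ?';
    $params = array($date);

    // Same rule as the original: filter only on a non-empty numeric subid.
    if (!empty($subid) && is_numeric($subid)) {
        $sql .= ' AND `subid` = ?';
        $params[] = $subid;
    }
    return array($sql, $params);
}

list($sql, $params) = build_day_totals_query('2011-10-19', '7');
// With PDO: $stmt = $pdo->prepare($sql); $stmt->execute($params);
```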

Or you could have a single query for the whole date range and move it outside the loop (as described by ajreal). Then use PHP to sort it all out.

For large databases I would rather split the queries up a little to balance the load and reduce the risk of timeouts. It also helps keep the code maintainable.

You should also look at how your database is structured and indexed.

Is it noticeably slow?

And is the array_push call necessary? (Not that it would save much; I just wondered because it looks redundant.)

If it's really slow, then maybe think about restructuring the process completely, depending on how you use it.

You could, for example, at 00:01 each day do something like this:

  • query the day's log and count the unique/total IP/ID amounts
  • insert just the count numbers and the date into a separate table
  • archive the day's log into a separate archive table or even a separate database such as MongoDB

This way you can run simple queries to view the data and manipulate the numbers to your heart's content with good performance. And by archiving, you keep the query table small by removing unnecessary rows while still retaining a log in case it is needed later.
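
The nightly job could look roughly like the following. The table names `tb_stats_daily` and `tb_stats_archive` and the helper `daily_rollup_statements` are made up for illustration, and the sketch assumes `tb_stats_archive` has the same columns as `tb_stats`. It stores per-subid totals only. The "unique" counts from the question depend on rows from earlier days, so they would have to be computed before anything is deleted.

```php
<?php
// Hypothetical nightly roll-up for one day, returned as a list of
// array($sql, $params) pairs to run in order, ideally in one transaction.
// `tb_stats_daily` and `tb_stats_archive` are assumed tables, not from the question.
function daily_rollup_statements($day)
{
    return array(
        // 1. Store only the per-subid counts for that day.
        array(
            'INSERT INTO `tb_stats_daily` (`date`, `subid`, `total_ids`, `total_ips`, `global`) '
            . 'SELECT `date`, `subid`, COUNT(DISTINCT `id`), COUNT(DISTINCT `ip`), COUNT(`id`) '
            . 'FROM `tb_stats` WHERE `date` = ? GROUP BY `date`, `subid`',
            array($day),
        ),
        // 2. Copy the raw rows to the archive table.
        array('INSERT INTO `tb_stats_archive` SELECT * FROM `tb_stats` WHERE `date` = ?', array($day)),
        // 3. Remove them from the live table to keep it small.
        array('DELETE FROM `tb_stats` WHERE `date` = ?', array($day)),
    );
}

// The job runs just after midnight, so it processes yesterday's log.
$statements = daily_rollup_statements(date('Y-m-d', strtotime('-1 day')));
```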

Of course, this may not fit with how your database is set up.

OTHER TIPS

Get all the subids.

For each table, build a single query that filters between the smallest and the largest date and groups by subid and date:

select subid, `date`, count(*) ... 
where subid IN($subids) and `date` between $smallest and $largest
group by subid, `date`

Iterate over the result and store each row in an array keyed by subid and date:

$mysql_results[$subid][$date] = ...;

Lastly, iterate over the $subids and the dates, like this:

foreach ($subids as $subid)
{
  for ($i = 0; $i < $days; $i++)
  {
     // set $date

     // check $mysql_results[$subid][$date] exists
  }
}

With something like the above, you only need 5 queries instead of

5 x total days x size of the subids
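
To make the indexing step concrete, here is a small sketch. It assumes the grouped query returns rows with `subid`, `date` and a count column (called `total_ids` here purely for illustration), and the function names are hypothetical. Subid/date combinations with no rows default to 0.

```php
<?php
// Index grouped rows as $indexed[$subid][$date] for direct lookups.
function index_by_subid_and_date(array $rows)
{
    $indexed = array();
    foreach ($rows as $row) {
        $indexed[$row['subid']][$row['date']] = $row;
    }
    return $indexed;
}

// Walk every subid/date pair, as the original loops did, but read from
// the pre-fetched array instead of querying inside the loop.
function fill_day_counts(array $subids, $from, $days, array $indexed)
{
    $result_array = array();
    foreach ($subids as $subid) {
        for ($i = 0; $i < $days; $i++) {
            $date = date('Y-m-d', strtotime($from . ' +' . $i . ' day'));
            $total = isset($indexed[$subid][$date])
                ? (int) $indexed[$subid][$date]['total_ids']
                : 0; // no rows for this subid on this date
            $result_array[] = array('subid' => $subid, 'date' => $date, 'total_ids' => $total);
        }
    }
    return $result_array;
}

// Sample rows standing in for the result of the grouped query.
$rows = array(
    array('subid' => '7', 'date' => '2011-10-19', 'total_ids' => '3'),
    array('subid' => '9', 'date' => '2011-10-20', 'total_ids' => '5'),
);
$counts = fill_day_counts(array('7', '9'), '2011-10-19', 2, index_by_subid_and_date($rows));
```

A grouped query does not return the overall row that the original code produced for the empty subid, so that row would have to be summed in PHP or fetched separately.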

Take all of your subids, and do a fetch with an IN predicate to get all of the values at once. Stuff that in to an array, then loop the array.
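
A minimal sketch of building that IN predicate with placeholders, so the subids are bound rather than concatenated into the SQL. The helper name `in_placeholders` and the sample subids are assumptions for illustration.

```php
<?php
// Build "?, ?, ?" with one placeholder per value for use in IN (...).
function in_placeholders(array $values)
{
    if (count($values) === 0) {
        throw new InvalidArgumentException('IN () needs at least one value');
    }
    return implode(', ', array_fill(0, count($values), '?'));
}

$subids = array(3, 7, 9); // sample values
$sql = 'SELECT `subid`, `date`, COUNT(`id`) AS `global` FROM `tb_stats` '
     . 'WHERE `subid` IN (' . in_placeholders($subids) . ') '
     . 'GROUP BY `subid`, `date`';
// With PDO: $stmt = $pdo->prepare($sql); $stmt->execute($subids);
```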

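Use PDO with the MySQL driver (pdo_mysql) instead of the old mysql extension, which was deprecated in PHP 5.5 and removed in PHP 7. PDO lets you prepare a query once and execute it many times with different parameters, which removes the need for manual escaping and can reduce the overhead of repeated calls. (MySQLi supports prepared statements as well.)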
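
For illustration, preparing once and executing repeatedly with PDO could look like this. The function name `count_rows_per_day` is hypothetical, and the caller is assumed to supply an already-connected `PDO` instance.

```php
<?php
// Prepare the statement once, then execute it for each date.
// $pdo is assumed to be an already-connected PDO instance.
function count_rows_per_day(PDO $pdo, array $dates)
{
    $stmt = $pdo->prepare('SELECT COUNT(*) FROM `tb_stats` WHERE `date` = ?');
    $counts = array();
    foreach ($dates as $date) {
        $stmt->execute(array($date)); // value is bound, no manual escaping
        $counts[$date] = (int) $stmt->fetchColumn();
        $stmt->closeCursor();
    }
    return $counts;
}
```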

Licensed under: CC-BY-SA with attribution