Question

I have a simple table named conn_log in Redshift/PostgreSQL.

In terms of performance, what is the difference between these two queries?

select t1.wifi_name, (t2.sconn*100)::numeric/t1.ttconn
from (select wifi_name, count(*) as ttconn 
      from conn_log 
      group by wifi_name) t1, 
     (select wifi_name, count(*) as sconn 
      from conn_log 
      where success = 1 
      group by wifi_name) t2
where t1.wifi_name = t2.wifi_name;

Second query:

select t1.wifi_name, (t2.sconn*100)::numeric/t1.ttconn
from (select wifi_name, count(*) as ttconn 
      from conn_log 
      group by wifi_name) t1
join
     (select wifi_name, count(*) as sconn 
      from conn_log 
      where success = 1 
      group by wifi_name) t2
on t1.wifi_name = t2.wifi_name;

Solution

As for the difference between INNER JOIN ... ON and putting the join condition in the WHERE clause, there is a good answer here. There are several answers there, and the accepted answer pretty much summarises them all.
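
One way to check this on your own system is to compare the execution plans of the two forms. The following is a minimal sketch, assuming the conn_log table from the question. For an inner join like this one, the planner would normally be expected to produce the same plan for both forms, but the output depends on your data and version, so treat it as something to verify rather than a guarantee.

```sql
-- Implicit join: comma-separated subqueries, join condition in WHERE
explain
select t1.wifi_name, (t2.sconn*100)::numeric/t1.ttconn
from (select wifi_name, count(*) as ttconn
      from conn_log
      group by wifi_name) t1,
     (select wifi_name, count(*) as sconn
      from conn_log
      where success = 1
      group by wifi_name) t2
where t1.wifi_name = t2.wifi_name;

-- Explicit join: same subqueries, join condition in ON
explain
select t1.wifi_name, (t2.sconn*100)::numeric/t1.ttconn
from (select wifi_name, count(*) as ttconn
      from conn_log
      group by wifi_name) t1
join
     (select wifi_name, count(*) as sconn
      from conn_log
      where success = 1
      group by wifi_name) t2
on t1.wifi_name = t2.wifi_name;
```

If the two plans show the same nodes and cost estimates, the choice between the two syntaxes is a matter of readability rather than performance.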

However, I cannot help but point out that your query can be rewritten to significantly improve performance, because it then reads conn_log once instead of twice and needs no join at all:

select wifi_name
      ,(sum(case when success = 1 then 1 else 0 end)*100)::numeric/count(*) as success_rate
from conn_log 
group by wifi_name;

In PostgreSQL 9.4+, it is even simpler:

select
    wifi_name, 
    (count(*) filter (where success = 1) * 100)::numeric / count(*) as success_rate
from conn_log 
group by wifi_name;
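
One detail worth checking in any of these variants is the division itself. In PostgreSQL, count(*) and sum() over integers return integer types, so dividing one by the other truncates the result unless one side is cast to numeric first, as the query in the question does. The sketch below uses a hypothetical table conn_log_demo with made-up rows, and assumes success is an integer flag where 1 means success.

```sql
-- Hypothetical sample data; success is assumed to be an integer flag (1 = success)
create temp table conn_log_demo (wifi_name text, success int);

insert into conn_log_demo values
    ('cafe', 1), ('cafe', 1), ('cafe', 0),
    ('lobby', 1), ('lobby', 0);

select wifi_name
       -- integer division: the fractional part is dropped
      ,sum(case when success = 1 then 1 else 0 end)*100/count(*) as truncated_rate
       -- cast to numeric before dividing to keep the fraction
      ,round((sum(case when success = 1 then 1 else 0 end)*100)::numeric/count(*), 2) as success_rate
from conn_log_demo
group by wifi_name
order by wifi_name;

-- Expected result in PostgreSQL:
--   cafe  | truncated_rate = 66 | success_rate = 66.67
--   lobby | truncated_rate = 50 | success_rate = 50.00
```

Note also that the single-pass query returns a row for every wifi_name, including networks with no successful connections (rate 0), whereas the inner-join version in the question omits those networks.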
Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange