Question

I have a query that is taking 48 seconds to execute as follows:

SELECT count(DISTINCT tmd_logins.userID) as totalLoginsUniqueLast30Days 
FROM tmd_logins
join tmd_users on tmd_logins.userID = tmd_users.userID 
where tmd_users.isPatient = 1 AND loggedIn > '2011-03-25' 
and tmd_logins.userID in 
    (SELECT userID as accounts30Days FROM tmd_users
    where isPatient = 1 AND created > '2012-04-29' AND computerID is null)

When I remove the DISTINCT keyword it takes less than 1 second, so it seems the bottle neck lies within that.

The database adds an entry to the tmd_logins table every time a user logs into the system. I am trying to get a total count of all users that are patients that have been created and logged in within a given period of time, such as in the last 30 days.

I have tried removing the DISTINCT keyword and adding group by tmd_logins.userID to the statement but the performance issue remains.

Table tmd_logins has about 300,000 records, tmd_users has about 40,000

Is there a better way of doing this?

Was it helpful?

Solution

The problem that you have is the execution plan. My guess is that the "in" clause might be confusing it. You might try:

SELECT count(DISTINCT tmd_logins.userID) as totalLoginsUniqueLast30Days 
FROM tmd_logins join
     tmd_users
     on tmd_logins.userID = tmd_users.userID join
     (SELECT distinct userID as accounts30Days
      FROM tmd_users
      where isPatient = 1 AND
            created > '2012-04-29' AND
            computerID is null
     ) t
     on tmd_logins.userID = t.accounts30Days
where tmd_users.isPatient = 1 AND
      loggedIn > '2011-03-25' 

That might or might not work. However, I'm wondering about the structure of the query itself. It would seem that UserID should be distinct in a table called tmd_users. If so, then you can wrap all your conditions into one:

SELECT count(DISTINCT tmd_logins.userID) as totalLoginsUniqueLast30Days 
FROM tmd_logins join
     tmd_users
     on tmd_logins.userID = tmd_users.userID 
where tmd_users.isPatient = 1 AND
      loggedIn > '2011-03-25' and
      created > '2012-04-29' AND
      computerID is null

If my guess is true, then this should definitely run faster.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top