Question

Alright, I totally hate myself for asking this; but I'm looking for another way to do this query, something that might be faster as well as more elegant (this looks like crap). Tell me what you think please:

  SELECT TRUNCATE(SUM(sub.Total),3) AS GrpTotal, sub.ActualDate, 
         TRUNCATE(SUM(sub.BonusAmt),3) AS GrpBonusAmt, sub.UID, 
         sub.CUSTID, YEAR(MIN(sub.ActualDate)) AS Year, pusers.username
  FROM ( SELECT a.UID, a.ActualDate, 'Global Report' AS Report, 
                SUM(a.totalpayment) AS Total, a.CUSTID,
               ((SUM(a.totalpayment)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmt
         FROM `globalreport` a
              LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
         UNION ALL
         SELECT a.UID, a.ActualDate, 'Amex Residuals' AS Report, 
                SUM(a.payment) AS Total, a.CUSTID,
                ((SUM(a.payment)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
         FROM `amexresiduals` a
              LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
         UNION ALL 
         SELECT a.UID, a.ActualDate, 'Compliance Fee' AS Report, 
                SUM(a.profit) AS Total, a.CUSTID, 
                ((SUM(a.profit)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
         FROM `compliancefee` a
              LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
         UNION ALL
         SELECT a.UID, a.ActualDate, 'Checks On Demand' AS Report, 
                SUM(a.myprofit) AS Total, a.CUSTID, 
                ((SUM(a.myprofit)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
         FROM `geticheck` a
              LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
         UNION ALL
         SELECT a.UID, a.ActualDate, 'Gift Cards on Demand' AS Report, 
                SUM(a.payment) AS Total, a.CUSTID,
                ((SUM(a.payment)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
         FROM `gcod` a
              LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
         UNION ALL
         SELECT a.UID, a.ActualDate, 'Global Check' AS Report, 
                SUM(a.myprofit) AS Total, a.CUSTID,
                ((SUM(a.myprofit)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
         FROM `globalcheck` a
              LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
         UNION ALL
         SELECT a.UID, a.ActualDate, 'Bonus True Up' AS Report, 
                SUM(a.finalpayment) AS Total, a.CUSTID,
                ((SUM(a.finalpayment)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
         FROM `bonustrueup` a
              LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
         UNION ALL
         SELECT a.UID, a.ActualDate, 'Bonus Take Back - Did Not Activate' AS Report, 
                SUM(a.amount) AS Total, a.CUSTID,
                ((SUM(a.amount)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt 
         FROM `bonusadjnosetup` a
              LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
         UNION ALL
         SELECT a.UID, a.ActualDate, 'Bonus Take Back - Closed Less Than 6 Months' AS Report, 
                SUM(a.amount) AS Total, a.CUSTID,
                ((SUM(a.amount)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
         FROM `bonusadjclosed6mo` a
             LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
         UNION ALL
         SELECT a.UID, a.ActualDate, 'Month End Fee Rejects' AS Report, 
                SUM(a.amount) AS Total, a.CUSTID, SUM(a.amount) AS BonusAmnt        
         FROM `merchantloss` a
             LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
         UNION ALL
         SELECT a.UID, a.ActualDate, 'Direct ACH Debits and Credits' AS Report, 
                SUM(a.amount*-1) AS Total, a.CUSTID, SUM(a.amount*-1) AS BonusAmnt
         FROM `dirachdebcred` a
             LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
         UNION ALL
         SELECT a.UID, a.ActualDate, 'Merchant Adjustments' AS Report, 
                SUM(a.amount) AS Total, a.CUSTID,
                ((SUM(a.amount)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
         FROM `merchantadj` a
             LEFT JOIN `users` u ON u.uid = a.UID
         WHERE true AND a.CUSTID = 1020
         GROUP BY a.ActualDate, a.UID
  ) sub
      LEFT JOIN `pending_users` pusers ON pusers.UID = sub.UID
  WHERE sub.CUSTID = 1020
       AND sub.`UID` NOT IN 
                ( SELECT `UID` 
                  FROM `users` 
                  WHERE `is_admin` AND `company_id` = sub.`CUSTID`)
  GROUP BY sub.ActualDate, sub.UID, sub.Report
  ORDER BY sub.ActualDate ASC

Clearly, this is a lengthy query. I'm just not sure it has to be. Essentially, I'm gathering and summing a different column in each of the unioned tables and grouping by that amount at the very end so I can get a total sum from all tables.

Was it helpful?

Solution

UPDATE

As if my previous answer wasn't long enough ...

I'd also move the JOIN to the users table from the inner queries to the outer queries, it looks like it's identical in every case.

But, we need that for the calculation of BonusAmnt (which appears to always be based on whatever Total is; the calculation is the same for all of them except for two. So, I also moved the BonusAmnt calculation to the outer query as well, and just adding a check for which calculation to do, using a CASE expression.

I may have missed something.

But here's how I'd write the query.

I'll show it in two pieces, with the query for the sub inline view separate from the outer query.

  SELECT TRUNCATE(sub.Total,3) AS GrpTotal
       , sub.ActualDate
       , CASE
           WHEN sub.Report IN ('Month End Fee Rejects','Direct ACH Debits and Credits') THEN
             TRUNCATE(sub.Total,3)
           ELSE
             TRUNCATE(sub.Total*IFNULL((u.retention_percent/100),1)+IFNULL(u.bonus_amount,0)),3)
         END AS GrpBonusAmt
       , sub.UID
       , sub.CUSTID
       , YEAR(sub.ActualDate) AS Year
       , pusers.username
    FROM (
           -- query to produce sub goes here
         ) sub
    LEFT
    JOIN `users` u
      ON u.uid = sub.UID
    LEFT
    JOIN `pending_users` pusers
      ON pusers.UID = sub.UID
    LEFT
    JOIN `users` n
      ON n.UID = sub.CUSTID
     AND n.is_admin
   WHERE n.UID IS NULL
   ORDER BY sub.ActualDate, sub.UID, sub.Report

And here is the second part. This is the query that gets stuck into the middle of the part above, as an inline view. This is the query that that produces sub row source for the outer query:

           SELECT 'Global Report' AS Report
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.totalpayment) AS Total
             FROM `globalreport` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID
            UNION ALL
           SELECT 'Amex Residuals'
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.payment) AS Total
             FROM `amexresiduals` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID
            UNION ALL
           SELECT 'Compliance Fee'
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.profit) AS Total
             FROM `compliancefee` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID
            UNION ALL
           SELECT 'Checks On Demand'
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.myprofit) AS Total
             FROM `geticheck` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID
            UNION ALL
           SELECT 'Gift Cards on Demand'
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.payment) AS Total
             FROM `gcod` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID
            UNION ALL
           SELECT 'Global Check' AS Report
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.myprofit) AS Total
             FROM `globalcheck` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID
            UNION ALL
           SELECT 'Bonus True Up'
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.finalpayment) AS Total
             FROM `bonustrueup` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID
            UNION ALL
           SELECT 'Bonus Take Back - Did Not Activate'
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.amount) AS Total
             FROM `bonusadjnosetup` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID
            UNION ALL
           SELECT 'Bonus Take Back - Closed Less Than 6 Months'
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.amount) AS Total
             FROM `bonusadjclosed6mo` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID
            UNION ALL
           SELECT 'Month End Fee Rejects'
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.amount) AS Total
             FROM `merchantloss` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID
            UNION ALL
           SELECT 'Direct ACH Debits and Credits'
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.amount*-1) AS Total
             FROM `dirachdebcred` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID
            UNION ALL
           SELECT 'Merchant Adjustments'
                , a.UID, a.ActualDate, a.CUSTID
                , SUM(a.amount) AS Total
             FROM `merchantadj` a
            WHERE a.CUSTID = 1020
            GROUP BY a.ActualDate, a.UID

OTHER TIPS

Since you're getting results from several different tables, your query actually looks really good. I agree that the length of the SQL text is a bit daunting.

The overall approach looks like it's going to be the most efficient way to get the specified result. I'm assuming that the inner queries against the individual tables are collapsing a lot of rows.

In terms of performance, about the only thing I would change is that NOT IN predicate, since that's going to cause the subquery to be executed for each row.

You could get an equivalent result using an anti-join pattern, assuming that UID is a primary key in users table, or, that it's at least guaranteed to be NOT NULL. (I'm assuming that the subquery doesn't return NULL; if it did, the NOT IN predicate wouldn't return TRUE, and you wouldn't get any rows back at all.)

So this:

  WHERE sub.CUSTID = 1020
       AND sub.`UID` NOT IN 
                ( SELECT `UID` 
                  FROM `users` 
                  WHERE `is_admin` AND `company_id` = sub.`CUSTID`)

can be replaced with an equivalent but (usually) more efficient:

   LEFT
   JOIN `users` n
     ON n.is_admin
    AND n.company_id = sub.`CUSTID`
    AND n.`UID` = sub.`UID`
  WHERE n.`UID` IS NULL
    AND sub.CUSTID = 1020

The anti-join pattern is looking for a matching row, and then excluding any rows that did have a match, so what we're left with is just the rows that didn't match.

Actually, there's no need check that sub.CUSTID = 1020, that can be omitted. We're already guaranteed it's going to be 1020 by each of the queries in the sub inline view.


In the SELECT list of the outer query, I think this MIN aggregate adds some confusion:

YEAR(MIN(sub.ActualDate))

It just looks unnecessary, since the query is doing a GROUP BY sub.ActualDate, and since the YEAR function is deterministic (i.e. the values returned by YEAR(foo) are going to be equal for equal values of foo. I'd replace that with just:

YEAR(sub.ActualDate)

Actually, I don't think it's necessary to perform a GROUP BY operation at all on the outer query.

The value returned for the Report column of sub are distinct to each of the UNION ALL queries. And each of those queries has already performed a GROUP BY UID, ActualDate. So, the (UID,ActualDate,Report) tuple from sub is already guaranteed to be unique.

The only question there is whether UID is unique in the pending_users table. If it's not, then the totals the query is returning are getting doubled, tripled, etc., and just picking the username value from one matching row. But I don't think that the query is doing that, I strongly suspect that UID is unique in pending_users. (There aren't any columns from pending_users included in the outermost GROUP BY.) Aside from matching multiple rows from pending_users...

The query has already guaranteed that there aren't any rows to be collapsed by that GROUP BY on the outermost query. It looks like all that GROUP BY is getting you is a sort operation. That could be replaced with an ORDER BY.

That also means that the SUM aggregates (in the SELECT list of the outermost query) aren't necessary. For example, this expression:

TRUNCATE(SUM(sub.Total),3) 

could be replaced with:

TRUNCATE(sub.Total,3)

and return an equivalent result.


In terms of performance... having suitable indexes available is important. in the EXPLAIN, we'd rather not see Using filesort for the GROUP BY operations of those inner queries. We'd much rather see Using index.

For those inner queries, ideally we'd see an indexes like this:

... ON merchantloss (CUSTID, ActualDate, UID, amount)

There's an equality predicate on CUSTID, and there's a GROUP BY operation on ActualDate,UID. Having the columns in this order (almost) guarantees that MySQL will use the index to perform the GROUP BY and avoid a Using filesort operation. Including the other columns referenced in the query makes it a covering index, meaning the query can be satisfied entirely from the index without any need to visit the pages in the underlying table.


One final note... it might be just a tiny bit more efficient to do that is_admin exclusion (using the anti-join pattern) on each of the individual inner queries, if that limits the number rows being returned from sub. But I suspect it won't make much difference, and it might even be slower... I'd leave it on the outermost query, if just for the sake of avoiding repeating that multiple times.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top