Yes, the query could be rewritten to improve performance (though it looks like a query generated by Hibernate, and getting Hibernate to use a different query can be a challenge.)
How sure are you that this query is returning the resultset you expect? Because the query is rather odd.
In terms of performance, dollars to donuts, its the repeated executions of the dependent subqueries that are really eating your lunch, and your lunchbox, in terms of performance. It looks like MySQL is using the index on the deviceId
column to satisfy that subquery, and that doesn't look like the most appropriate index.
We notice that there are two JOIN operations to the devices table; there is no reason this table needs to be joined twice. Both JOIN operations require a match to the deviceID column of goldenvariation, the second join to the devices table does additional filtering with the isDeleted=0
. The keywords INNER
and CROSS
don't have any impact on the statement at all; and the second join to the devices
table isn't really a "cross" join, it's really an inner join. (We prefer to see the join predicates in an ON clause rather than the WHERE clause.
The DATE()
function wrapped around the rundate
column disables an index range scan operation. These predicates can be rewritten to take advantage of an appropriate index.
The DISTINCT(deviceId)
in the SELECT list of an EXISTS subquery is very strange. Firstly, DISTINCT
is a keyword, not a function. There's no need for parens around deviceId
. But beyond that, it doesn't matter what is returned in the SELECT list of the EXISTS subquery, it could just be SELECT 1
.
It's odd to see an EXISTS
predicate with a query that doesn't reference any expression in from the outer query (i.e. a correlated subquery). It's valid syntax. With a correlated subquery, MySQL performs that query for each and every row returned by the outer query. The EXPLAIN output looks like MySQL is doing the same thing, it didn't recognize any optimization.
The way those EXIST predicates are written, if there isn't a 'policy-options' row with 'MISMATCH' AND there isn't a 'policy-options' row with 'MISSING' (for the specified date, then the query will not return any rows. If a row of each type is found (for the specified date, then ALL of 'policy-options' rows for that date are returned. (It's syntactically valid, but it's rather odd.)
Assuming that the id
column on the devices table is UNIQUE (i.e. it's the PRIMARY KEY or there's a UNIQUE index on that column, then the DISTINCT keyword is unnecessary on the outermost query. (From the EXPLAIN output, it looks like MySQL already optimized away the usual operations, that is, MySQL recognized that the DISTINCT keyword is unnecessary.
But bottom line, it's the dependent subqueries that are killing performance; the absence of suitable indexes, and the predicate on the date column wrapped in a function.
To answer your question, yes, this query can be rewritten to return an equivalent resultset more efficiently. (It's not entirely clear that the query is returning the resultset you expect.)
SELECT d1.id AS id27_
, d1.createdTime AS createdT2_27_
, d1.deletedOn AS deletedOn27_
, d1.deviceAlias AS deviceAl4_27_
, d1.deviceName AS deviceName27_
, d1.deviceTypeId AS deviceT21_27_
, d1.equipmentVendor AS equipmen6_27_
, d1.exceptionDetail AS exceptio7_27_
, d1.hardwareVersion AS hardware8_27_
, d1.ipAddress AS ipAddress27_
, d1.isDeleted AS isDeleted27_
, d1.loopBack AS loopBack27_
, d1.modifiedTime AS modifie12_27_
, d1.osVersion AS osVersion27_
, d1.productModel AS product14_27_
, d1.productName AS product15_27_
, d1.routerType AS routerType27_
, d1.rundate AS rundate27_
, d1.serialNumber AS serialN18_27_
, d1.serviceName AS service19_27_
, d1.siteId AS siteId27_
, d1.siteIdA AS siteIdA27_
, d1.status AS status27_
, d1.creator AS creator27_
, d1.lastModifier AS lastMod25_27_
FROM devices d1
JOIN (SELECT g.deviceId
FROM goldenvariation g
CROSS
JOIN (SELECT 1
FROM goldenvariation x3
WHERE x3.goldenVariationType = 'MISMATCH'
AND x3.classType = 'policy-options'
AND x3.rundate >= '2014-04-14'
AND x3.rundate < '2014-04-14' + INTERVAL 1 DAY
LIMIT 1
) t3
CROSS
JOIN (SELECT 1
FROM goldenvariation x4
WHERE x4.goldenVariationType = 'MISSING'
AND x4.classType = 'policy-options'
AND x4.rundate >= '2014-04-14'
AND x4.rundate < '2014-04-14' + INTERVAL 1 DAY
LIMIT 1
) t4
WHERE g.classType = 'policy-options'
AND g.rundate >= '2014-04-14'
AND g.rundate < '2014-04-14' + INTERVAL 1 DAY
GROUP BY g.deviceId
) t2
ON t2.device_id = d1.id
WHERE d1.isDeleted=0