Optimizing MySQL queries with a large IN() clause or a join on a derived table

https://stackoverflow.com/questions/2091777

21-09-2019

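Question

Let's say I need to query the associates of a corporation. I have a table, "transactions", which contains data on every transaction made.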

CREATE TABLE `transactions` (
  `transactionID` int(11) unsigned NOT NULL,
  `orderID` int(11) unsigned NOT NULL,
  `customerID` int(11) unsigned NOT NULL,
  `employeeID` int(11) unsigned NOT NULL, 
  `corporationID` int(11) unsigned NOT NULL,
  PRIMARY KEY (`transactionID`),
  KEY `orderID` (`orderID`),
  KEY `customerID` (`customerID`),
  KEY `employeeID` (`employeeID`),
  KEY `corporationID` (`corporationID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

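It's fairly straightforward to query this table for associates, but there's a twist: a transaction record is registered once per employee, so there may be multiple records for one corporation per order.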

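For example, if employees A and B from corporation 1 were both involved in selling a vacuum cleaner to corporation 2, there would be two records in the "transactions" table: one for each employee, and both for corporation 1. This must not affect the results, though. A trade from corporation 1 must be treated as one trade, regardless of how many of its employees were involved.

Easy, I thought. I'll just join on a derived table, like so: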

SELECT corporationID FROM transactions JOIN (SELECT DISTINCT orderID FROM transactions WHERE corporationID = 1) AS foo USING (orderID)

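The query returns a list of corporations that have been involved in trades with corporation 1. That's exactly what I need, but it's very slow because MySQL can't use the corporationID index to determine the derived table. As I understand it, this is the case for all subqueries/derived tables in MySQL.

I've also tried querying the collection of orderIDs separately and using a ridiculously large IN() clause (typically 100 000+ IDs), but as it turns out MySQL has trouble using indexes on ridiculously large IN() clauses as well, and as a result the query time does not improve.

Are there any other options available, or have I exhausted them both?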

Solution

If I understand your requirement, you could try this:

select distinct t1.corporationID
from transactions t1
where exists (
    select 1
    from transactions t2
    where t2.corporationID =  1
    and t2.orderID = t1.orderID)
and t1.corporationID != 1;

Or this:

select distinct t1.corporationID
from transactions t1
join transactions t2
on t2.orderID = t1.orderID
and t1.transactionID != t2.transactionID
where t2.corporationID = 1
and t1.corporationID != 1;
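
Both queries look up rows by corporationID and then match them on orderID, so a composite index covering both columns may help more than the two single-column indexes in the original table definition. The following is only a sketch: the index names idx_corp_order and idx_order_corp are made up for illustration, and whether the optimizer actually picks them depends on the MySQL version and the data distribution, so it is worth checking with EXPLAIN.

```sql
-- Assumption: index names are hypothetical; they are not part of the original schema.
ALTER TABLE transactions
  ADD INDEX idx_corp_order (corporationID, orderID),  -- find the orders of corporation 1
  ADD INDEX idx_order_corp (orderID, corporationID);  -- find the corporations on each order

-- Inspect the plan of the EXISTS variant to see which indexes are used.
EXPLAIN
select distinct t1.corporationID
from transactions t1
where exists (
    select 1
    from transactions t2
    where t2.corporationID = 1
    and t2.orderID = t1.orderID)
and t1.corporationID != 1;
```

If the key column of the EXPLAIN output still shows only the single-column indexes, or no index at all, the composite indexes are not helping for this data and can be dropped again.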

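Other tips

Your data makes no sense to me. I think you are using corporationID where you mean customerID at some point, because your query joins the transactions table to itself on orderID for the orders with corporationID = 1 in order to get the corporationIDs... which would then be 1, right?

Can you please specify what customerID, employeeID, and corporationID mean? How do I know that employees A and B are from corporation 1? In that case, is corporation 1 the corporationID, and corporation 2 the customer, and therefore stored in customerID?

If that is the case, you just need a GROUP BY: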

SELECT customerID
FROM transactions
WHERE corporationID = 1
GROUP BY customerID

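(Or select and group by orderID if you want one row per order instead of one row per customer.)

By using GROUP BY, you ignore the fact that there are multiple records that are duplicates except for the employeeID.

Conversely, to return all corporations that have sold to corporation 2: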

SELECT corporationID
FROM transactions
WHERE customerID = 2
GROUP BY corporationID
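
To make the effect of the grouping concrete, here is a small sketch using the vacuum cleaner example from the question. The row values (order 100 and 101, employees 10 and 11, customer 3) are invented for illustration, and it assumes the interpretation above, where the buying corporation is stored in customerID.

```sql
-- Hypothetical sample rows: employees 10 and 11 of corporation 1 both take part
-- in order 100 for customer 2; employee 10 also handles order 101 for customer 3.
INSERT INTO transactions (transactionID, orderID, customerID, employeeID, corporationID) VALUES
  (1, 100, 2, 10, 1),
  (2, 100, 2, 11, 1),
  (3, 101, 3, 10, 1);

-- The two rows for order 100 collapse into a single row for customer 2,
-- so the result has two rows in total: customerID 2 and customerID 3.
SELECT customerID
FROM transactions
WHERE corporationID = 1
GROUP BY customerID;
```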

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow