Question

I need to fetch the rows from one database table that are not present in another database table. For that I am using a NOT IN clause. This works fine for a small amount of data, but it takes a very long time when the tables are large: I tested with 2000 rows and it took more than 8 minutes. I think the NOT IN clause is the cause. Could anyone suggest an alternative way to do this?

Edit: There is a small change in the schema. In dept_project_tasks I have the ids of the association tables (I am using Rails, which adds default ids).

The DB and query details are as follows (there is also an SQL Fiddle link for the same; please don't modify it on that SQL Fiddle page):

Database 1:

 create table TASKS(task_code VARCHAR(20), task_name VARCHAR(20), project_name VARCHAR(20),dept_code VARCHAR(20));

 insert into TASKS(task_code, task_name, project_name, dept_code) VALUES("task1", "task1", "project1", "dept1");
 insert into TASKS(task_code, task_name, project_name, dept_code) VALUES("task1", "task1", "project1", "dept2");
 insert into TASKS(task_code, task_name, project_name, dept_code) VALUES("task2", "task2", "project2", "dept1");
 insert into TASKS(task_code, task_name, project_name, dept_code) VALUES("task2", "task2", "project2", "dept3");
 insert into TASKS(task_code, task_name, project_name, dept_code) VALUES("task3", "task3", "project3", "dept2");
 insert into TASKS(task_code, task_name, project_name, dept_code) VALUES("task3", "task3", "project3", "dept1");
 insert into TASKS(task_code, task_name, project_name, dept_code) VALUES("task4", "task4", "project4", "dept1");
 insert into TASKS(task_code, task_name, project_name, dept_code) VALUES("task4", "task4", "project4", "dept3");

Database 2:

 create table depts(dept_code VARCHAR(20), dept_name VARCHAR(20));
 create table project_tasks(task_code VARCHAR(20), task_name VARCHAR(20));
 create table dept_project_tasks(dept_code VARCHAR(20), task_code VARCHAR(20));

 insert into depts(dept_code, dept_name) values("dept1", "dept_one");
 insert into depts(dept_code, dept_name) values("dept2", "dept_two");

 insert into project_tasks(task_code, task_name) values("task1", "task1");
 insert into project_tasks(task_code, task_name) values("task2", "task2");
 insert into project_tasks(task_code, task_name) values("task3", "task3");

 insert into dept_project_tasks(dept_code, task_code) values("dept1", "task1");
 insert into dept_project_tasks(dept_code, task_code) values("dept2", "task1");
 insert into dept_project_tasks(dept_code, task_code) values("dept1", "task2");
 insert into dept_project_tasks(dept_code, task_code) values("dept3", "task2");

The query is:

 SELECT distinct task_code 
 from TASKS as TS 
 where TS.dept_code="dept1"  
   AND TS.task_code NOT IN (SELECT `project_tasks`.task_code 
                            FROM `project_tasks` 
                            INNER JOIN `dept_project_tasks` ON `project_tasks`.task_code = `dept_project_tasks`.task_code 
                            WHERE `dept_project_tasks`.dept_code = "dept1"
                           );

Thanks in advance


Solution

Use LEFT JOIN like so:

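SELECT DISTINCT ts.task_code
FROM TASKS AS ts
LEFT JOIN `project_tasks` pt ON ts.task_code = pt.task_code
-- the dept filter belongs in the ON clause so that unmatched rows survive the join
LEFT JOIN `dept_project_tasks` dpt ON pt.task_code = dpt.task_code
                                  AND dpt.dept_code = "dept1"
WHERE ts.dept_code = "dept1"
  AND dpt.task_code IS NULL;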

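Another common way to write the same anti-join is NOT EXISTS. The following is only a sketch against the schema from the question, not something benchmarked on the asker's data. It keeps the dept1 filter on both sides, and unlike NOT IN it does not return an empty result when the subquery produces a NULL task_code.

```sql
-- Anti-join with NOT EXISTS: dept1 task codes from TASKS that have
-- no matching dept1 row in project_tasks / dept_project_tasks.
SELECT DISTINCT ts.task_code
FROM TASKS AS ts
WHERE ts.dept_code = 'dept1'
  AND NOT EXISTS (
        SELECT 1
        FROM project_tasks AS pt
        INNER JOIN dept_project_tasks AS dpt
                ON dpt.task_code = pt.task_code
        WHERE pt.task_code = ts.task_code   -- correlate with the outer row
          AND dpt.dept_code = 'dept1'
      );
```

With the sample rows from the question this returns task3 and task4, the same as the original NOT IN query.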

Other tips

Firstly, you don't need to bring in project_tasks, provided there is an enforced foreign key in place:

SELECT distinct task_code 
from TASKS as TS 
where TS.dept_code="dept1" 
AND    TS.task_code NOT IN (
    SELECT `dept_project_tasks`.task_code 
    FROM `dept_project_tasks` WHERE `dept_project_tasks`.dept_code = "dept1"
);

Secondly, you need an index on dept_project_tasks that leads with dept_code and also includes task_code, so the subquery can be answered from the index alone.
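As a rough sketch of that index, assuming MySQL and the table definitions from the question (the index name idx_dpt_dept_task is made up for illustration):

```sql
-- Composite index: dept_code is used for filtering, and task_code can be
-- read straight from the index without touching the table rows.
CREATE INDEX idx_dpt_dept_task
    ON dept_project_tasks (dept_code, task_code);

-- Check whether the optimizer actually picks the index up.
EXPLAIN
SELECT DISTINCT TS.task_code
FROM TASKS AS TS
WHERE TS.dept_code = 'dept1'
  AND TS.task_code NOT IN (
        SELECT dpt.task_code
        FROM dept_project_tasks AS dpt
        WHERE dpt.dept_code = 'dept1'
      );
```

If the EXPLAIN output shows idx_dpt_dept_task in the key column for dept_project_tasks, the subquery is using the index. Whether this alone fixes the 8-minute run time depends on the real data, so it is worth comparing timings before and after.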

Licensed under: CC-BY-SA with attribution