Check against multiple duplicate records
22-10-2019
Question
I have data that I need to insert into a table. Before inserting, I need to check it against the existing rows for duplicates and report a list of the duplicate records.
Table:
CREATE TABLE `test` (
`A` varchar(19) NOT NULL,
`B` varchar(9) NOT NULL,
KEY `A` (`A`),
KEY `B` (`B`)
) ENGINE=InnoDB;
I need to check for duplicates on both columns.
Number of records to insert: ~1000
Rows in table: ~1.000.000
What is an efficient way of doing this?
Thanks in advance.
Solution
This would depend on the table's layout.
Suppose you have the following table:
CREATE TABLE `mydata` (
`A` varchar(19) NOT NULL,
`B` varchar(9) NOT NULL,
KEY `A` (`A`),
KEY `B` (`B`)
) ENGINE=InnoDB;
Before you insert the 1000 rows into mydata, you could preload them into a staging table called mynewdata, and keep a second copy in mynewdups, like this:
CREATE TABLE mynewdata LIKE mydata;
CREATE TABLE mynewdups LIKE mydata;
INSERT INTO mynewdata ... ;
INSERT INTO mynewdups SELECT * FROM mynewdata;
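Before deleting anything, the duplicates can also be previewed with a read-only query. The following is a minimal sketch, not part of the original answer; it assumes the mydata and mynewdata tables defined above, and the aliases N and D are arbitrary. Each EXISTS subquery compares a single column, which gives MySQL the opportunity to use the KEY on A or on B; checking the plan with EXPLAIN would confirm whether it does.

```sql
-- List staged rows whose A or B already exists in mydata (read-only preview).
SELECT N.A, N.B
FROM mynewdata N
WHERE EXISTS (SELECT 1 FROM mydata D WHERE D.A = N.A)
   OR EXISTS (SELECT 1 FROM mydata D WHERE D.B = N.B);
```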
Next, delete all rows in mynewdata that match A or B in mydata:
DELETE T1.* FROM mynewdata T1 INNER JOIN mydata T2 ON T1.A=T2.A OR T1.B=T2.B;
What's left in mynewdata are the rows whose A and B both have no match in mydata.
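An OR inside a join condition can make it harder for the optimizer to use the single-column indexes on A and B. As a hedged alternative (an assumption worth verifying with EXPLAIN on your MySQL version, not something the original answer states), the same rows can be removed from mynewdata with two deletes that each join on one column:

```sql
-- Remove staged rows whose A already exists in mydata.
DELETE T1.* FROM mynewdata T1 INNER JOIN mydata T2 ON T1.A = T2.A;
-- Remove staged rows whose B already exists in mydata.
DELETE T1.* FROM mynewdata T1 INNER JOIN mydata T2 ON T1.B = T2.B;
```

Run one after the other, the two statements delete the same set of rows as the single OR-based DELETE.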
What about the rows that matched? Delete every row in mynewdups that has no match in mydata, so that only the duplicates remain:
DELETE T1.* FROM mynewdups T1 LEFT JOIN mydata T2
ON T1.A=T2.A OR T1.B=T2.B
WHERE T2.A IS NULL;
What's left in mynewdata is the data to import.
What's left in mynewdups is the data that had a duplicate key in mydata.
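To finish, the two staging tables can be used to produce the report and do the import. This is a sketch of the remaining steps, assuming the staging tables are no longer needed afterwards. Note that it does not catch duplicates within the new batch itself, because the keys on A and B are not UNIQUE.

```sql
-- Report the rejected rows.
SELECT A, B FROM mynewdups;
-- Import the rows that passed the check.
INSERT INTO mydata (A, B) SELECT A, B FROM mynewdata;
-- Drop the staging tables once the report has been saved.
DROP TABLE mynewdata;
DROP TABLE mynewdups;
```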
Give it a try!