Question

I have data that I need to insert into a table. Before inserting, I need to check for duplicate records and report a list of them.

Table:

CREATE TABLE `test` (
  `A` varchar(19) NOT NULL,
  `B` varchar(9) NOT NULL,
  KEY `A` (`A`),
  KEY `B` (`B`)
) ENGINE=InnoDB;

I need to check both columns.

Number of records to insert: ~1000

Rows in table: ~1,000,000

What is an efficient way of doing this?

Thanks in advance.


Solution

This would depend on the table's layout.

Suppose you have the following table:

CREATE TABLE `mydata` ( 
    `A` varchar(19) NOT NULL, 
    `B` varchar(9) NOT NULL, 
    KEY `A` (`A`), 
    KEY `B` (`B`) 
) ENGINE=InnoDB; 

Before you insert the 1000 rows into mydata, you could preload them into another table called mynewdata, and keep a second copy in mynewdups, like this:

CREATE TABLE mynewdata LIKE mydata;
CREATE TABLE mynewdups LIKE mydata;
INSERT INTO mynewdata ... ;
INSERT INTO mynewdups SELECT * FROM mynewdata;
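
Before deleting anything, the overlap can be previewed with a read-only query. This is a minimal sketch, assuming mynewdata has already been loaded as above; the aliases n and d are only for illustration. Each EXISTS probes a single column, so it should be able to use the KEY on A or the KEY on B.

```sql
-- List staged rows whose A or B already exists in mydata (nothing is modified)
SELECT n.A, n.B
FROM mynewdata n
WHERE EXISTS (SELECT 1 FROM mydata d WHERE d.A = n.A)
   OR EXISTS (SELECT 1 FROM mydata d WHERE d.B = n.B);
```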

Next, delete all rows in mynewdata that match A or B in mydata:

DELETE T1.* FROM mynewdata T1 INNER JOIN mydata T2 ON T1.A=T2.A OR T1.B=T2.B;
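
If the OR in the join condition turns out to be slow against the ~1,000,000 rows in mydata, one variant worth trying is to split it into two statements, one per column. This is a sketch, not a measured result: it removes the same set of rows, but whether it is actually faster is an assumption to verify with EXPLAIN on your own data.

```sql
-- Remove staged rows whose A already exists in mydata (can use KEY `A`)
DELETE T1 FROM mynewdata T1 INNER JOIN mydata T2 ON T1.A = T2.A;

-- Remove staged rows whose B already exists in mydata (can use KEY `B`)
DELETE T1 FROM mynewdata T1 INNER JOIN mydata T2 ON T1.B = T2.B;
```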

What's left in mynewdata are the rows whose A and B both have no match in mydata.

What about the rows that matched? Delete the rows in mynewdups that have no match in mydata, so that only the matching ones remain:

DELETE T1.* FROM mynewdups T1 LEFT JOIN mydata T2
ON T1.A=T2.A OR T1.B=T2.B
WHERE T2.A IS NULL;

What's left in mynewdata is data to import

What's left in mynewdups is data that had a dup key in mydata
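
The remaining steps (report, import, clean up) could look like the sketch below. It assumes both cleanup deletes have already run, so that mynewdups holds only colliding rows and mynewdata holds only new ones. Note that this approach only compares against mydata; duplicates among the staged rows themselves are not detected.

```sql
-- Report the staged rows that collide with existing data
SELECT A, B FROM mynewdups ORDER BY A, B;

-- Import the rows that had no match
INSERT INTO mydata (A, B)
SELECT A, B FROM mynewdata;

-- Drop the staging tables once the report has been saved
DROP TABLE mynewdata, mynewdups;
```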

Give it a try!

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange