How do I do a bulk mysql insert without incrementing the auto_increment column for duplicates?

StackOverflow https://stackoverflow.com/questions/13318396 · 28-11-2021

Question

Here's my database schema

+------------------+------------------+------+-----+---------------------+----------------+
| Field            | Type             | Null | Key | Default             | Extra          |
+------------------+------------------+------+-----+---------------------+----------------+
| phone_number     | varchar(64)      | NO   | UNI | NULL                |                |
| id               | int(10) unsigned | NO   | PRI | NULL                | auto_increment |
+------------------+------------------+------+-----+---------------------+----------------+

I'd like to be able to insert several phone numbers (phone_number is a unique key) at once, but I don't want to increment the auto_increment field if I have duplicates.

If I do

INSERT INTO phone_numbers (phone_number)
VALUES (%values%)
ON DUPLICATE KEY UPDATE id = id;

the auto_increment will increase even for duplicates.

This question: Prevent auto increment on MySQL duplicate insert doesn't cover bulk inserts. I'd like to do something like this:

INSERT INTO phone_numbers (phone_number)
SELECT '12345', '123456'
FROM DUAL
WHERE NOT EXISTS
   (
      SELECT phone_number
      FROM phone_numbers
      WHERE phone_number IN ('12345', '123456')
   );

but the DUAL table doesn't really handle multiple values well.

Any ideas? MySQL 5.5.


Solution

After reading the other article, one approach is to bulk insert into a temporary table, then select the rows from the temp table into the actual table. The duplicate rows are removed at that point, so the auto_increment field in the actual table stays correct.

CREATE TABLE phone_numbers (
    id int(10) unsigned NOT NULL AUTO_INCREMENT,
    phone_number varchar(64) NOT NULL,
    PRIMARY KEY (id),
    UNIQUE (phone_number)
);

CREATE TEMPORARY TABLE tmp_phone_numbers (
    phone_number varchar(64) NOT NULL,
    UNIQUE (phone_number)
);

-- bulk insert
INSERT INTO tmp_phone_numbers (phone_number) VALUES (%values%);

-- Remove phone numbers that already exist. This leaves a unique
-- set of phone numbers that do not exist in the real table.
DELETE FROM tmp_phone_numbers WHERE phone_number IN (SELECT phone_number FROM phone_numbers);

-- copy into real table
INSERT INTO phone_numbers (phone_number) SELECT phone_number FROM tmp_phone_numbers;

-- The temp table is dropped when your connection is closed.
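To see the staging flow end to end, here is a small sketch in Python using SQLite (which supports the same temp-table pattern, though its id semantics differ from MySQL's AUTO_INCREMENT). The table and column names mirror the answer; the `bulk_insert` helper is just an illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE phone_numbers (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    phone_number TEXT NOT NULL UNIQUE
);
CREATE TEMPORARY TABLE tmp_phone_numbers (
    phone_number TEXT NOT NULL UNIQUE
);
""")

def bulk_insert(conn, numbers):
    # Stage the batch in the temp table; duplicates inside the batch
    # collapse because of the UNIQUE constraint.
    conn.executemany(
        "INSERT OR IGNORE INTO tmp_phone_numbers (phone_number) VALUES (?)",
        [(n,) for n in numbers])
    # Remove numbers that already exist in the real table.
    conn.execute(
        "DELETE FROM tmp_phone_numbers WHERE phone_number IN "
        "(SELECT phone_number FROM phone_numbers)")
    # Copy only genuinely new numbers, so the real table's counter
    # advances exactly once per new row.
    conn.execute(
        "INSERT INTO phone_numbers (phone_number) "
        "SELECT phone_number FROM tmp_phone_numbers")
    # Clear the staging table for the next batch.
    conn.execute("DELETE FROM tmp_phone_numbers")

bulk_insert(conn, ["12345", "123456"])
bulk_insert(conn, ["123456", "999999"])  # one duplicate, one new number

print(conn.execute(
    "SELECT id, phone_number FROM phone_numbers ORDER BY id").fetchall())
# -> [(1, '12345'), (2, '123456'), (3, '999999')]
```

Note that the ids stay contiguous across batches even though the second batch contained a duplicate, which is exactly the property the temp-table detour buys you.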

Here is another option:

If you know you won't hit the int(10) limit in the first bulk load or any subsequent ones, you can use INSERT INTO phone_numbers (phone_number) VALUES (%values%) ON DUPLICATE KEY UPDATE id = id;, which will leave gaps in the id field. After you are done bulk loading, you can drop the id column and then re-add it, which recreates all of your ids with no gaps.

Other tips

Since INSERT IGNORE increments the auto-increment counter even when it makes no change to the data, a simple workaround is to restore the counter at the end of the transaction. This leverages the engine's rule that, when you try to force a value that is too low, it always chooses an integer just above the current highest id:

ALTER TABLE phone_numbers AUTO_INCREMENT = 0;

Doing this does not require a table rebuild, and can be tested as follows:

SELECT MAX(id) FROM phone_numbers;
INSERT IGNORE INTO phone_numbers (phone_number) VALUES ("0123456789");
SELECT MAX(id) FROM phone_numbers;
INSERT IGNORE INTO phone_numbers (phone_number) VALUES ("0123456789");
SELECT MAX(id) FROM phone_numbers;
ALTER TABLE phone_numbers AUTO_INCREMENT = 1;
INSERT IGNORE INTO phone_numbers (phone_number) VALUES ("1000000000");
SELECT MAX(id) FROM phone_numbers;

+---------+
| MAX(id) |
+---------+
|    NULL |
+---------+
1 row in set (0.00 sec)

Query OK, 1 row affected (0.00 sec)

+---------+
| MAX(id) |
+---------+
|       1 |
+---------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

+---------+
| MAX(id) |
+---------+
|       1 |
+---------+
1 row in set (0.00 sec)

Query OK, 1 row affected (0.33 sec)
Records: 1  Duplicates: 0  Warnings: 0

Query OK, 0 rows affected (0.00 sec)

+---------+
| MAX(id) |
+---------+
|       2 |
+---------+
1 row in set (0.00 sec)

Just use INSERT IGNORE. That will skip any inserts that fail because of duplicate keys, and the rest of the inserts will go through fine (though, as noted above, it still increments the auto-increment counter for the skipped rows).
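For a quick illustration of the IGNORE behavior, here is a SQLite sketch (SQLite spells it INSERT OR IGNORE). One caveat: unlike MySQL's InnoDB, SQLite does not consume an id for the ignored row, so this only demonstrates the deduplication, not the id-gap behavior the other answers work around.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phone_numbers ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "phone_number TEXT UNIQUE)")

# The batch contains an internal duplicate; OR IGNORE silently skips
# the conflicting row instead of failing the whole statement.
batch = ["12345", "123456", "12345"]
conn.executemany(
    "INSERT OR IGNORE INTO phone_numbers (phone_number) VALUES (?)",
    [(n,) for n in batch])

print(conn.execute(
    "SELECT id, phone_number FROM phone_numbers ORDER BY id").fetchall())
# -> [(1, '12345'), (2, '123456')]
```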

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow