Question

I just found out that a table on my production server (which holds approx. 35K records) contains 588 duplicate entries in an INT(11) column which has AUTO_INCREMENT. The UNIQUE key is missing on that column so that's probably the cause.

Any ideas on how to give all duplicate entries a unique ID and then adding the UNIQUE key to the column so this will never happen again?

Table schema:

CREATE TABLE `items` (
 `item_ID` int(11) unsigned NOT NULL auto_increment,
 `u_ID` int(10) NOT NULL default '0',
 `user_ID` int(11) NOT NULL default '0',
 `p_ID` tinyint(4) NOT NULL default '0',
 `url` varchar(255) NOT NULL,
 `used` int(10) unsigned NOT NULL,
 `sort` tinyint(4) NOT NULL,
 `last_checked` int(11) NOT NULL,
 `unixtime` int(11) NOT NULL,
 `switched` int(11) NOT NULL,
 `active` tinyint(1) NOT NULL default '0',
 UNIQUE KEY `unique` (`p_ID`,`url`),
 KEY `index` (`u_ID`,`item_ID`,`sort`,`active`),
 KEY `index2` (`u_ID`,`switched`,`active`),
 KEY `item_ID` (`item_ID`),
 KEY `p_ID` (`p_ID`),
 KEY `u_ID` (`u_ID`)
) ENGINE=MyISAM AUTO_INCREMENT=42755 DEFAULT CHARSET=utf8
Was it helpful?

Solution

How about something like this? Again test it on a backup first.

# Copy duplicate records
CREATE TABLE newitem SELECT * FROM items WHERE item_ID IN 
    (SELECT item_ID FROM itemd GROUP BY item_ID HAVING COUNT(*) > 1);

# remove auto increment from id in new table
ALTER TABLE newitem DROP INDEX Item_ID, MODIFY item_ID int;

# delete duplicates from original
DELETE FROM item WHERE item_ID IN (SELECT DISTINCT item_ID FROM newitem);

#Update column to be primary key
ALTER TABLE items DROP INDEX Item_ID, ADD PRIMARY KEY (Item_ID);

# Set new duplicate ID's to null
UPDATE newitem SET item_ID=NULL;

# Insert records back into old table
INSERT INTO item SELECT * FROM newitem;

# Get rid of work table
DROP newitem;

OTHER TIPS

Since you already have a UNIQUE key on the table, you could use this to make an UPDATE statement that re-assigns unique ids to item_id:

UPDATE
        items AS it
    JOIN
        ( SELECT 
              i.p_ID, i.url, @id:= @id+1 AS id
          FROM 
                  items AS i  
              CROSS JOIN 
                  ( SELECT @id:=0 ) AS dummy
          ORDER BY
              i.p_ID, i.url
        ) AS unq
      ON 
      (unq.p_ID, unq.url) = (it.p_ID, it.url)
SET 
    it.item_id = unq.id ;

Then you can add a unique index on item_id

Interesting. You have an auto_increment without a Primary Key reference, just an index, this is why you have dupes in the first place. If you try to update and assign primary key (item_ID) MySQL will complain because of the dupes in the item_ID column.

Your engine is MyISAM which means you don't have any FK constraints, so you could do a mysqldump of the table, truncate the table, update the schema, then re-import the data. Upon re-import MySQL should correctly insert all rows with truly unique Item_Ids.

I'll outline the steps here, but I strongly suggest you do this in a dev environment to confirm the steps work correctly, before applying to your production environment. I accept no responsibility for borked production data :)

$ mysqldump -u <user_name> -h <db_host> --opt <database_name> --single-transaction > backup.sql

mysql> truncate table `items`;

mysql> ALTER TABLE `items` DROP INDEX `Item_ID`, ADD PRIMARY KEY (`item_ID`), AUTO_INCREMENT = 1;

$ vi backup.sql # Remove the AUTO_INCREMENT reference from the Create Table syntax

$ mysql -h <host_name> <db_name> -u <username> -p < backup.sql    

Give that a shot, these steps are un-tested but should set you down the right path.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top