Question

I have a table like this: person_item(id, person_id, item_name, value). Each (person_id, item_name) pair should be unique, which means that if I run this SQL,

SELECT person_id, item_name, COUNT(*) FROM person_item GROUP BY person_id, item_name

every count should be 1. However, I found duplicate data: some (person_id, item_name) pairs have more than one row. I would like to keep the first row of each group and delete the duplicates. The algorithm should look like this:

1. FROM person_item GROUP BY person_id, item_name
2. If COUNT(*) > 1, keep the first row and delete the rest

However, I don't know how to write such a SQL script without creating a new table. Thank you.


Solution

I tested the following on MySQL and it served the purpose. There are two things you need to do:

1. You need a unique row identifier. In your case I think id serves that purpose.

2. If your row id is not the primary key, you have to disable MySQL safe update mode (in MySQL Workbench: Edit > Preferences > SQL Editor).
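If you would rather not change the editor preference, safe update mode can also be switched off for the current session only, through the sql_safe_updates system variable. A minimal sketch:

```sql
-- Turn safe update mode off for this session only.
SET SQL_SAFE_UPDATES = 0;

-- ... run the DELETE that removes the duplicates here ...

-- Turn it back on so later statements are protected again.
SET SQL_SAFE_UPDATES = 1;
```

The setting does not outlive the connection, so other sessions keep their safe update protection.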

create table test2 (
rowid varchar(10),
id varchar(20),
person_id varchar(20),
item_name varchar(20),
value varchar(20));

insert into test2 
(rowid,id, person_id,item_name,value)
values ('1','1','1','first item','first value');

insert into test2
(rowid,id, person_id,item_name,value)
values ('2','1','1','first item','first value');

commit;

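-- Before: the duplicated (person_id, item_name) pair shows a count of 2.
SELECT person_id, item_name, COUNT(*) FROM test2 GROUP BY person_id, item_name;

-- Keep the first row (lowest rowid) of each group and delete the rest.
-- Use MAX instead of MIN to keep the last row of each group.
-- rowid is a varchar in this test table, so it is compared as text.
-- The extra derived table x is needed because MySQL does not allow a
-- subquery to select directly from the table being deleted from.
DELETE FROM test2
 WHERE rowid NOT IN (SELECT *
                       FROM (SELECT MIN(n.rowid)
                               FROM test2 n
                           GROUP BY n.person_id, n.item_name) x);

-- After: every pair shows a count of 1.
SELECT person_id, item_name, COUNT(*) FROM test2 GROUP BY person_id, item_name;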

N.B. Deleting rows is destructive, and if you are the person responsible for the schema you probably should not be losing data, so try this in a test environment before running it against real data.
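The same cleanup can be applied directly to the person_item table from the question. The sketch below uses a different technique from the NOT IN query (a self-join delete). It assumes that id is unique per row, that a lower id means an earlier row, and that person_id and item_name are never NULL:

```sql
-- Preview: list the pairs that currently have more than one row.
SELECT person_id, item_name, COUNT(*) AS row_count
  FROM person_item
 GROUP BY person_id, item_name
HAVING COUNT(*) > 1;

-- Delete every row for which an earlier row (lower id) exists
-- with the same person_id and item_name.
DELETE dup
  FROM person_item AS dup
  JOIN person_item AS earlier
    ON earlier.person_id = dup.person_id
   AND earlier.item_name = dup.item_name
   AND earlier.id < dup.id;
```

Running the preview query again afterwards should return no rows.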

OTHER TIPS

If you want to have a unique constraint, I'd suggest that you add that to the schema and have the database enforce it. You should not be in the position of having to write this query.
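As a sketch of what that could look like on the existing table (the constraint name uq_person_item is made up for illustration). The statement fails while duplicates are still present, so it has to run after the cleanup:

```sql
-- Let the database reject any second row for the same pair.
ALTER TABLE person_item
  ADD CONSTRAINT uq_person_item UNIQUE (person_id, item_name);
```

Once the constraint is in place, an INSERT that repeats an existing (person_id, item_name) pair is rejected with a duplicate-key error.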

It sounds like this is a many-to-many join table where the primary key should be (person_id, item_name). That would guarantee uniqueness.
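A minimal sketch of such a table, assuming integer person ids and the column sizes shown (the question does not give the actual types). It drops the surrogate id column, since the pair itself identifies the row:

```sql
CREATE TABLE person_item (
    person_id INT          NOT NULL,
    item_name VARCHAR(100) NOT NULL,
    value     VARCHAR(255),
    PRIMARY KEY (person_id, item_name)
);

INSERT INTO person_item (person_id, item_name, value)
VALUES (1, 'first item', 'first value');

-- Re-inserting the same pair would normally raise a duplicate-key error.
-- ON DUPLICATE KEY UPDATE overwrites the stored value instead.
INSERT INTO person_item (person_id, item_name, value)
VALUES (1, 'first item', 'second value')
ON DUPLICATE KEY UPDATE value = 'second value';
```

Whether to overwrite or to reject the second insert is a design choice. Leaving out the ON DUPLICATE KEY UPDATE clause keeps the first value and surfaces the error to the caller.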

Licensed under: CC-BY-SA with attribution