How to eliminate values in an Oracle SQL table which mean the same?
Question
I've got a table like this:
ID | Val1 | Val2
---------------------
1 | 1 | 2
2 | 1 | 3
3 | 2 | 1
4 | 2 | 3
5 | 3 | 1
6 | 3 | 2
Now my problem is that 1 - 2 means the same as 2 - 1 (compare ID 1 and ID 3, for example), and I want to eliminate all entries where Val1 - Val2 means the same as Val2 - Val1 in another row (I hope you can follow my logic here).
Solution
How about this:
DELETE t
WHERE ID IN
(SELECT t1.id
FROM t t1 JOIN t t2
ON (t1.val1 = t2.val2 AND
t1.val2 = t2.val1 AND
t1.id < t2.id));
I arbitrarily kept the row with the greatest ID value.
Example:
SQL> CREATE TABLE t (ID INTEGER, val1 INTEGER, val2 INTEGER);
Table created
SQL> INSERT INTO t VALUES(1,1,2);
1 row inserted
SQL> INSERT INTO t VALUES(2,1,3);
1 row inserted
SQL> INSERT INTO t VALUES(3,2,1);
1 row inserted
SQL> INSERT INTO t VALUES(4,2,3);
1 row inserted
SQL> INSERT INTO t VALUES(5,3,1);
1 row inserted
SQL> INSERT INTO t VALUES(6,3,2);
1 row inserted
SQL> INSERT INTO t VALUES(7,4,4);
1 row inserted
SQL> INSERT INTO t VALUES(8,4,4);
1 row inserted
SQL> SELECT * FROM t;
ID VAL VAL
--- --- ---
1 1 2
2 1 3
3 2 1
4 2 3
5 3 1
6 3 2
7 4 4
8 4 4
8 rows selected
SQL> DELETE t
2 WHERE ID IN (SELECT t1.id
3 FROM t t1 JOIN t t2 ON (t1.val1 = t2.val2 AND t1.val2 = t2.val1 AND t1.id < t2.id));
4 rows deleted
SQL> SELECT * FROM t;
ID VAL VAL
--- --- ---
3 2 1
5 3 1
6 3 2
8 4 4
SQL>
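This is easily adapted to keep different rows, e.g., the row with the smaller val1 in each mirrored pair (and, when both values are equal, the row with the greatest ID):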
DELETE t
WHERE ID IN
(SELECT t1.id
FROM t t1 JOIN t t2
ON (t1.val1 = t2.val2 AND
t1.val2 = t2.val1 AND
(t2.val1 < t1.val1 OR (t2.val1 = t1.val1 AND t2.id > t1.id))));
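The same cleanup can also be sketched without a self-join. This is only a sketch, and it assumes that ID is unique and never NULL, that val1 and val2 are never NULL, and that exact duplicates such as two (1, 2) rows should be collapsed as well (the join versions above leave those alone). LEAST and GREATEST put each pair into a canonical order, so mirrored rows fall into the same group, and the row with the greatest ID in each group is kept:

```sql
-- Sketch: keep only the highest ID for every unordered (val1, val2) pair.
-- Assumes t.id is unique and NOT NULL, and val1/val2 are NOT NULL.
DELETE FROM t
 WHERE id NOT IN (SELECT MAX(id)
                    FROM t
                   GROUP BY LEAST(val1, val2), GREATEST(val1, val2));
```

On the eight sample rows from the example this leaves IDs 3, 5, 6 and 8, the same as the first join version.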
UPDATE: Couldn't think of a really clever way, so here's the brute force method to answer the question in your comment:
CREATE TABLE t (ID INTEGER, val1 INTEGER, val2 INTEGER, val3 INTEGER);
INSERT INTO t VALUES (1, 1, 2, 3);
INSERT INTO t VALUES (2, 1, 3, 2);
INSERT INTO t VALUES (3, 2, 1, 3);
INSERT INTO t VALUES (4, 2, 3, 1);
INSERT INTO t VALUES (5, 3, 1, 2);
INSERT INTO t VALUES (6, 3, 2, 1);
INSERT INTO t VALUES (7, 1, 2, 4);
INSERT INTO t VALUES (8, 1, 3, 5);
INSERT INTO t VALUES (9, 1, 4, 2);
INSERT INTO t VALUES (10, 1, 1, 1);
INSERT INTO t VALUES (11, 1, 1, 1);
INSERT INTO t VALUES (12, 1, 3, 5);
SQL> select * from t order by id;
ID VAL VAL VAL
--- --- --- ---
1 1 2 3
2 1 3 2
3 2 1 3
4 2 3 1
5 3 1 2
6 3 2 1
7 1 2 4
8 1 3 5
9 1 4 2
10 1 1 1
11 1 1 1
12 1 3 5
12 rows selected
DELETE FROM t
WHERE ID IN (SELECT t1.ID FROM t t1 JOIN t t2 ON (t1.val1 = t2.val1 AND
t1.val2 = t2.val2 AND
t1.val3 = t2.val3 AND t1.id < t2.id)
UNION ALL
SELECT t1.ID FROM t t1 JOIN t t2 ON (t1.val1 = t2.val1 AND
t1.val2 = t2.val3 AND
t1.val3 = t2.val2 AND t1.id < t2.id)
UNION ALL
SELECT t1.ID FROM t t1 JOIN t t2 ON (t1.val1 = t2.val2 AND
t1.val2 = t2.val1 AND
t1.val3 = t2.val3 AND t1.id < t2.id)
UNION ALL
SELECT t1.ID FROM t t1 JOIN t t2 ON (t1.val1 = t2.val2 AND
t1.val2 = t2.val3 AND
t1.val3 = t2.val1 AND t1.id < t2.id)
UNION ALL
SELECT t1.ID FROM t t1 JOIN t t2 ON (t1.val1 = t2.val3 AND
t1.val2 = t2.val1 AND
t1.val3 = t2.val2 AND t1.id < t2.id)
UNION ALL
SELECT t1.ID FROM t t1 JOIN t t2 ON (t1.val1 = t2.val3 AND
t1.val2 = t2.val2 AND
t1.val3 = t2.val1 AND t1.id < t2.id));
select * from t order by id;
ID VAL VAL VAL
--- --- --- ---
6 3 2 1
9 1 4 2
11 1 1 1
12 1 3 5
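A more compact variant is possible if the three columns are numeric and never NULL, and ID is unique and never NULL (these are assumptions; the brute-force version above does not need the numeric one). The smallest and largest values come from LEAST and GREATEST, and the middle one is the sum minus those two, so every permutation of the same three values lands in the same group:

```sql
-- Sketch: keep the highest ID for every unordered (val1, val2, val3) triple.
-- Assumes numeric, NOT NULL val columns and a unique, NOT NULL id.
DELETE FROM t
 WHERE id NOT IN (
         SELECT MAX(id)
           FROM t
          GROUP BY LEAST(val1, val2, val3),
                   -- middle value = total minus smallest minus largest
                   val1 + val2 + val3
                     - LEAST(val1, val2, val3)
                     - GREATEST(val1, val2, val3),
                   GREATEST(val1, val2, val3));
```

On the twelve sample rows this keeps IDs 6, 9, 11 and 12, matching the result above.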
Other tips
I don't recall whether this syntax is valid in Oracle (mainly the use of an alias as the DELETE target), but you can try this:
DELETE
T1
FROM
My_Table T1
INNER JOIN My_Table T2 ON
T2.val1 = T1.val2 AND
T2.val2 = T1.val1
WHERE
-- remove the mirrored row that breaks the val1 < val2 convention
T1.val1 > T1.val2
Since the ordering of the columns doesn't seem to matter, I would make an arbitrary decision and I would add a constraint to the table to check that val1 < val2. You can then put a unique constraint on the combination of the two columns (if you don't already have one) and be sure that you won't have this problem again.
Of course, you would also need to make sure that any application or code that inserts rows into the table knows that convention (val1 should always be the smaller of the two values) and follows it.
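The two constraints described above could look roughly like this. The constraint and index names are made up for the sketch, and My_Table is the placeholder name used earlier in this answer:

```sql
-- Both statements fail if existing rows violate them,
-- so the mirrored rows have to be cleaned up first.
ALTER TABLE My_Table
  ADD CONSTRAINT my_table_val_order_ck CHECK (val1 < val2);

ALTER TABLE My_Table
  ADD CONSTRAINT my_table_val_pair_uq UNIQUE (val1, val2);

-- Possible alternative that does not depend on the insert convention:
-- a function-based unique index over the pair in canonical order.
CREATE UNIQUE INDEX my_table_val_pair_ux
    ON My_Table (LEAST(val1, val2), GREATEST(val1, val2));
```

Note that CHECK (val1 < val2) also rejects rows where both values are equal; if such rows are legitimate, val1 <= val2 would be the check to use.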
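If "eliminate" means "do not show", try this. It returns no rows for those conditions, so both rows of a mirrored pair are hidden (as is any row where Val1 equals Val2, since it matches itself):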
select * from YourTable t1
where not exists (select * from YourTable t2
where t1.Val1 = t2.Val2
and t1.Val2 = t2.Val1)
delete from YourTable
where (Val1, Val2)
in (select Val2, Val1 from YourTable where Val1 > Val2)
There is a corner case that isn't handled well here: the case where Val1 and Val2 are equal. Deleting all but one occurrence of such rows is a bit trickier. Does anyone have any ideas?
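One possible way to cover that corner case, sketched under the assumption that Val1 and Val2 are never NULL and that any single row per unordered pair may survive: number the rows within each pair with ROW_NUMBER and delete everything after the first. ROWID is used so that no key column has to be assumed, and the ORDER BY prefers a row with Val1 >= Val2, in line with the delete above:

```sql
-- Sketch: keep exactly one row per unordered (Val1, Val2) pair,
-- including pairs where Val1 = Val2.
DELETE FROM YourTable
 WHERE ROWID IN (
         SELECT rid
           FROM (SELECT ROWID AS rid,
                        ROW_NUMBER() OVER (
                          PARTITION BY LEAST(Val1, Val2), GREATEST(Val1, Val2)
                          ORDER BY Val1 DESC, ROWID) AS rn
                   FROM YourTable)
          WHERE rn > 1);
```

Unlike the IN-based delete, this also collapses exact duplicates such as two (4, 4) rows or two (2, 1) rows.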