Question

I've got a table containing many rows. The rows are guaranteed to have been inserted in order of a column called created_on, which is a datetime column. If a given row has a created_on time within 5 seconds of an existing row, I'd like to delete the given row.

How would I write a query to delete those rows?

Solution

-- Preview: rows that have an earlier row within the preceding 5 seconds
SELECT *
FROM so902859 AS A
WHERE EXISTS (
    SELECT *
    FROM so902859 AS B
    WHERE DATE_SUB(A.created_on, INTERVAL 5 SECOND) <= B.created_on
        AND B.created_on < A.created_on
)

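Note that this matches (and the DELETE below removes) every event in a chain of events that each fall within 5 seconds of the one before, except for the first event in the chain. A row can therefore be removed even when it is more than 5 seconds after the chain's first event.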
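To make the chain behaviour concrete, here is a small sketch. The table name chain_demo and the timestamps are made up for illustration, and the syntax assumes MySQL (which DATE_SUB ... INTERVAL suggests):

```sql
-- Hypothetical demo table and timestamps, for illustration only.
CREATE TABLE chain_demo (created_on DATETIME NOT NULL);

INSERT INTO chain_demo (created_on) VALUES
    ('2024-01-01 12:00:00'),  -- first event of a chain: not matched
    ('2024-01-01 12:00:03'),  -- 3 s after the previous row: matched
    ('2024-01-01 12:00:06'),  -- 3 s after the previous row, 6 s after the first: matched
    ('2024-01-01 12:00:20');  -- 14 s after the previous row: not matched

-- Same predicate as the query above, run against the demo table.
SELECT A.created_on
FROM   chain_demo AS A
WHERE  EXISTS ( SELECT *
                FROM   chain_demo AS B
                WHERE  DATE_SUB(A.created_on, INTERVAL 5 SECOND) <= B.created_on
                       AND B.created_on < A.created_on );
```

With these rows the SELECT should return 12:00:03 and 12:00:06. The 12:00:06 row is 6 seconds after the first event, but it is still matched because it is within 5 seconds of 12:00:03.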

Because a single-table DELETE cannot alias its target table (at least in older MySQL versions), you'll have to do something like this:

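DELETE
FROM   so902859
WHERE  created_on IN (
        -- The extra derived table ("doomed" is just an illustrative name) is
        -- materialized first; MySQL typically rejects a subquery that reads
        -- the table being deleted from directly (error 1093).
        SELECT  created_on
        FROM  ( SELECT DISTINCT A.created_on
                FROM   so902859 AS A
                WHERE  EXISTS ( SELECT *
                                FROM   so902859 AS B
                                WHERE  DATE_SUB(A.created_on, INTERVAL 5 SECOND) <= B.created_on
                                       AND B.created_on < A.created_on ) ) AS doomed )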

There are a million ways to skin this cat using JOINs or whatever. I think this one is the most clearly understandable, if a bit lengthy.
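For comparison, a JOIN-based variant might look like the sketch below. It assumes MySQL's multi-table DELETE syntax, which does allow aliases; I haven't run it against your data, so try it on a copy of the table first.

```sql
-- Assumes MySQL multi-table DELETE. A is the row to remove,
-- B is any earlier row within the preceding 5 seconds.
DELETE A
FROM   so902859 AS A
JOIN   so902859 AS B
  ON   B.created_on >= DATE_SUB(A.created_on, INTERVAL 5 SECOND)
 AND   B.created_on <  A.created_on;
```

It uses the same predicate as the EXISTS version, so it should remove the same rows: everything in a chain except its first event.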

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow