I am trying to use SQLite3 to form correlations between two tables that have similar data.
Here's what I have so far:
CREATE TABLE a (date TEXT, user TEXT, ip TEXT);
CREATE INDEX a_index ON a (date, user, ip);
CREATE TABLE b (date TEXT, ip TEXT);
CREATE UNIQUE INDEX b_index ON b (date, ip);
INSERT INTO a VALUES('2014-03-01 03:15:16', 'a', '127.0.0.1');
INSERT INTO a VALUES('2014-03-01 03:15:18', 'b', '127.0.0.2');
INSERT INTO a VALUES('2014-03-01 03:15:21', 'c', '127.0.0.3');
INSERT INTO a VALUES('2014-03-01 03:15:21', 'd', '127.0.0.4');
INSERT INTO a VALUES('2014-03-01 03:15:29', 'e', '127.0.0.5');
INSERT INTO a VALUES('2014-03-01 03:16:32', 'f', '127.0.0.6');
INSERT INTO b VALUES('2014-03-01 03:15:16', '127.0.0.1');
INSERT INTO b VALUES('2014-03-01 03:15:17', '127.0.0.1');
INSERT INTO b VALUES('2014-03-01 03:15:19', '127.0.0.1');
INSERT INTO b VALUES('2014-03-01 03:15:22', '127.0.0.4');
INSERT INTO b VALUES('2014-03-01 03:16:32', '127.0.0.5');
I know I could simply use an inner join to combine these two sets, like this:
SELECT *
FROM a
JOIN b ON a.ip = b.ip AND a.date = b.date;
and it would return
2014-03-01 03:15:16|a|127.0.0.1|2014-03-01 03:15:16|127.0.0.1
as expected. However, because of clock drift in the recorded times, I would like to match any entries that are within 3 seconds of each other (before or after). For that, I have used:
SELECT *
FROM a
JOIN b ON a.ip = b.ip AND a.date BETWEEN DATETIME(b.date, '-3 seconds') AND DATETIME(b.date, '+3 seconds');
This works, but it returns more entries than I want. It currently returns the following:
2014-03-01 03:15:16|a|127.0.0.1|2014-03-01 03:15:16|127.0.0.1
2014-03-01 03:15:16|a|127.0.0.1|2014-03-01 03:15:17|127.0.0.1
2014-03-01 03:15:16|a|127.0.0.1|2014-03-01 03:15:19|127.0.0.1
2014-03-01 03:15:21|d|127.0.0.4|2014-03-01 03:15:22|127.0.0.4
Is it possible to return at most one row for each entry in the a table when a matching entry is found in the b table? The expected result would look something like this:
2014-03-01 03:15:16|a|127.0.0.1|2014-03-01 03:15:16|127.0.0.1
2014-03-01 03:15:21|d|127.0.0.4|2014-03-01 03:15:22|127.0.0.4
How should / could this be accomplished?
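For reference, here is a minimal sketch of one possible direction, not a confirmed solution. For each row in a, a correlated subquery picks the single row in b with the same ip whose date is closest within the 3-second window. The alias b2, the use of rowid, and the strftime('%s', ...) conversion are assumptions made for illustration; the sketch also assumes b is an ordinary rowid table, as created above.

```sql
-- Sketch: keep at most one b row per a row (the closest one in time).
-- Assumes dates are stored as 'YYYY-MM-DD HH:MM:SS' text, as in the sample data.
SELECT a.date, a.user, a.ip, b.date, b.ip
FROM a
JOIN b
  ON b.ip = a.ip
 AND b.rowid = (
       -- Candidate b rows for this a row: same ip, within 3 seconds either way.
       SELECT b2.rowid
       FROM b AS b2
       WHERE b2.ip = a.ip
         AND b2.date BETWEEN DATETIME(a.date, '-3 seconds')
                         AND DATETIME(a.date, '+3 seconds')
       -- Smallest time difference first; the earlier b2.date breaks ties.
       ORDER BY ABS(strftime('%s', b2.date) - strftime('%s', a.date)),
                b2.date
       LIMIT 1
     )
ORDER BY a.date, a.user;
```

Against the sample data above, this should return only the two expected rows (user a paired with 03:15:16 and user d paired with 03:15:22). Note that it limits the output to one row per entry in a, but the same b row could still be paired with more than one a row.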