Question

I am pulling data from a url/JSON, decoding, and sending elements of the JSON to a sqlite table.

As I pull the JSON every other minute, sometimes I pull the same JSON (which hasn't yet been refreshed). However, I don't want to enter the same data more than once into the table. Any solution to this problem is helpful.

My thought was to simply include the 'time executed' element of the JSON as one of the fields being passed to sqlite. Therefore, if I use REPLACE instead of INSERT, I will insert new rows into the existing SQL table if and only if the JSON has a new timestamp. Here is what I mean:

import json
import sqlite3

# y holds the raw JSON text fetched from the URL
data = json.loads(y)  # named data so the json module is not shadowed
jsontime = data['executionTime']

db = sqlite3.connect('database.db')
c = db.cursor()

c.execute("""CREATE TABLE IF NOT EXISTS cbdata (
    cb_id INTEGER PRIMARY KEY ASC,
    tjson DATE,
    id INTEGER,
    Name TEXT,
    Age INTEGER);""")

for item in data['List']:
    i1 = item["id"]
    i2 = item["Name"]
    i3 = item["Age"]
    iall = [jsontime, i1, i2, i3]
    c.execute("REPLACE into cbdata values(NULL,?,?,?,?)", iall)

db.commit()

However, this isn't preventing duplicate rows from being entered. Every time the script runs, new (even if duplicate) entries are inserted into the table.

Thoughts? Other solutions? Thanks kindly.

Solution

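For a REPLACE query to work as intended, the table must say which columns uniquely identify a row, i.e., you need to add a UNIQUE constraint to the table. REPLACE only overwrites an existing row when the new row would violate a PRIMARY KEY or UNIQUE constraint; otherwise it behaves exactly like INSERT. In your table the only unique column is cb_id, and because you pass NULL for it, SQLite assigns a fresh value on every insert, so no conflict ever occurs. (The db has no way of knowing which of the four non-primary-key columns is intended to be unique to a row.) Thus you should add a uniqueness constraint on the timestamp, or, if you want to allow more than one id, Name, Age tuple for a given timestamp (as the loop over the JSON's 'List' items suggests), add a multiple-column uniqueness constraint, for example on the timestamp together with id.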
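
As a rough sketch of the multiple-column variant, the following assumes that the pair (tjson, id) is what identifies a row; that choice is a guess about your data, not something the question confirms. The SAMPLE payload and the store and count_rows helpers are made up for illustration, and an in-memory database is used so the snippet can be run on its own.

```python
import json
import sqlite3

# Hypothetical payload standing in for the JSON fetched from the URL.
SAMPLE = json.dumps({
    "executionTime": "2015-01-01 12:00:00",
    "List": [
        {"id": 1, "Name": "Ann", "Age": 30},
        {"id": 2, "Name": "Bob", "Age": 41},
    ],
})


def store(db, raw):
    """Write every item of one JSON pull into cbdata."""
    data = json.loads(raw)
    jsontime = data["executionTime"]
    # Assumption: a row is identified by (tjson, id). The UNIQUE constraint
    # is what gives REPLACE something to conflict with.
    db.execute("""CREATE TABLE IF NOT EXISTS cbdata (
        cb_id INTEGER PRIMARY KEY ASC,
        tjson DATE,
        id INTEGER,
        Name TEXT,
        Age INTEGER,
        UNIQUE (tjson, id));""")
    rows = [(jsontime, item["id"], item["Name"], item["Age"])
            for item in data["List"]]
    # A row with an already-stored (tjson, id) replaces the old row
    # instead of being added next to it.
    db.executemany("REPLACE INTO cbdata VALUES (NULL, ?, ?, ?, ?)", rows)
    db.commit()


def count_rows(db):
    """Return the number of rows currently in cbdata."""
    return db.execute("SELECT COUNT(*) FROM cbdata").fetchone()[0]


db = sqlite3.connect(":memory:")
store(db, SAMPLE)
store(db, SAMPLE)  # the same, not yet refreshed JSON pulled a second time
```

After both calls the table should still hold only the two rows from the payload. Note that REPLACE deletes the old row and inserts a new one, so cb_id changes on every re-pull; INSERT OR IGNORE keeps the original row untouched if that matters. Also, CREATE TABLE IF NOT EXISTS does not alter a table that already exists, so an existing cbdata would have to be recreated, or given a unique index with CREATE UNIQUE INDEX once its duplicates are removed.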

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow