Question

I just had an issue with Django and PostgreSQL that I don't understand.

I have a simple model, defined such as:

class MyModel(models.Model):
    my_field = models.IntegerField()
    my_other_field = models.TextField()

In my view, I have something similar to:

my_object = MyModel(my_field=1, my_other_field='blah')
my_object.save()

Everything was working fine, until this morning. I got this error:

 IntegrityError at /my_url/

duplicate key value violates unique constraint "my_model_pkey"
DETAIL:  Key (id)=(3) already exists.
CONTEXT:  Remote SQL command: INSERT INTO public.my_model(id, my_field, my_other_field) VALUES ($1, $2, $3) RETURNING id

I had this error once before, and I know it is related to the way PostgreSQL keeps the sequence associated with my model in sync with the id column. I had to run this function in PostgreSQL until the id returned was greater than the largest existing value of the id.

select nextval('my_model_id_seq'::regclass);

My question is: why did this happen in the first place, and how can I prevent it in the future?

By the way, that's the only way I insert data into the table, I've never inserted data manually.

I hope the question is clear enough


Solution

I think the question is not "why is my sequence getting messed up" - rather it is "why is Django trying to supply a value for the id column when inserting a row, instead of allowing the database to insert the next value in the sequence".

The Django documentation describes the algorithm it uses to decide whether it should be doing an UPDATE or an INSERT when you call save().

This algorithm involves checking if the 'id' field of the object is already set to some value. If it is not, then it does an INSERT (presumably not specifying a value for the 'id' field). If it is set, then it first tries to do an UPDATE; if that does not result in an updated record, then it will do an INSERT (this time presumably it would specify a value for the 'id' field).
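
As a rough illustration of the decision described above, here is a minimal sketch in plain Python. It is not Django's actual implementation: `plan_save` and its parameters are hypothetical names, and the returned strings only stand in for the SQL that would be issued.

```python
from typing import List, Optional


def plan_save(pk: Optional[int], update_matched_row: bool = False) -> List[str]:
    """Return the statements a save() call would issue under the
    simplified algorithm described above (illustrative only)."""
    if pk is None:
        # No primary key set: a plain INSERT, the database assigns the id.
        return ["INSERT (without id)"]
    if update_matched_row:
        # Primary key set and the UPDATE reported a changed row: done.
        return ["UPDATE"]
    # Primary key set but the UPDATE reported no row: fall back to an
    # INSERT that sends the id explicitly.
    return ["UPDATE", "INSERT (with id)"]


print(plan_save(None))
print(plan_save(3, update_matched_row=True))
print(plan_save(3, update_matched_row=False))
```

The last case is the one that matches the error message: an INSERT that carries an explicit id.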

As pointed out in Erwin's answer, the error message you are seeing indicates that it is trying to insert a row while specifying the value for the 'id' field.

I note that it appears this algorithm has changed in version 1.6 of Django. Previously it used a SELECT first to see if a record existed, then an UPDATE if it did or an INSERT if it did not. If your problem has started occurring since upgrading, then that could be a cause. The documentation notes:

There are some rare cases where the database doesn’t report that a row was updated even if the database contains a row for the object’s primary key value. An example is the PostgreSQL ON UPDATE trigger which returns NULL. In such cases it is possible to revert to the old algorithm by setting the select_on_save option to True.

If this were happening for you, then it would explain your symptoms: the error would actually be occurring when trying to update a value in the database, and Django would erroneously think that the row did not exist and then try to create it.

You could check for this by setting the select_on_save option to True in the model's Meta class, which reverts to the old behavior.

Another possible reason for this would be if your code inadvertently set the 'id' attribute on an object to some value, and then called save(). This could cause various problems, depending on whether the value already existed in the database or not. In particular, it might result in creating a row which has an 'id' value which is ahead of the current range of the sequence associated with the column, so that later on you would get errors trying to insert into the table.
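
To see how an explicitly set id can break later inserts, here is a toy simulation in plain Python. It assumes only that the sequence counter advances independently of the rows in the table, which is how a serial column behaves. `ToyTable` and `DuplicateKey` are made-up names, and nothing here talks to PostgreSQL.

```python
class DuplicateKey(Exception):
    """Raised when an id is already present in the toy table."""


class ToyTable:
    """Toy stand-in for a table with a serial primary key."""

    def __init__(self):
        self.rows = {}        # id -> payload
        self.last_value = 0   # state of the toy sequence

    def nextval(self):
        # The counter advances regardless of what the table contains.
        self.last_value += 1
        return self.last_value

    def insert(self, payload, row_id=None):
        # An explicit id bypasses the sequence; the counter does not move.
        if row_id is None:
            row_id = self.nextval()
        if row_id in self.rows:
            raise DuplicateKey(f"Key (id)=({row_id}) already exists.")
        self.rows[row_id] = payload
        return row_id


table = ToyTable()
table.insert("a")              # id 1 drawn from the sequence
table.insert("b")              # id 2 drawn from the sequence
table.insert("c", row_id=3)    # explicit id, sequence still at 2

try:
    table.insert("d")          # sequence hands out 3, which is taken
    collided = False
except DuplicateKey as exc:
    collided = True
    print(exc)

retry_id = table.insert("d")   # sequence has moved on to 4
print(retry_id)
```

The failed insert still consumed a sequence value, so the retry gets 4 and succeeds. That is consistent with the question, where calling nextval() repeatedly eventually moved the sequence past the largest id.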

Another possible reason could be using the 'force_insert' argument to save() on an object which had previously been loaded from the database (so that it was actually an existing row you should be updating).

OTHER TIPS

The root of the problem lies here (SQL command from your error message):

INSERT INTO public.my_model(id, my_field, my_other_field)
VALUES ($1, $2, $3)
RETURNING id

Since your id column seems to be a serial type, do not insert values manually. Let the default draw from the sequence automatically. It should be:

INSERT INTO public.my_model(my_field, my_other_field)
VALUES ($1, $2)
RETURNING id;

That's the whole point of adding RETURNING id to begin with: to return the newly generated id. If you pass in a value yourself, you wouldn't need to have it returned.

Fix

If the sequence got out of sync somehow, because manual entries conflict with the numbers from nextval(), run this query once:

SELECT setval('my_model_id_seq', max(id)) FROM my_model;

This sets the sequence to the current maximum id. The next call to nextval() returns the following number, so there is no off-by-one error.
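
The off-by-one remark follows from how setval() is documented: the two-argument form marks the value as already used, so the next nextval() advances before returning. A toy model of that behavior in plain Python may make it concrete. `ToySequence` is a hypothetical stand-in, not a PostgreSQL API, and the ids are invented for illustration.

```python
class ToySequence:
    """Toy model of a sequence's last_value / is_called state."""

    def __init__(self):
        self.last_value = 1
        self.is_called = False

    def nextval(self):
        # Advance only if the stored value has already been handed out.
        if self.is_called:
            self.last_value += 1
        self.is_called = True
        return self.last_value

    def setval(self, value, is_called=True):
        # Two-argument form: is_called defaults to True, so the next
        # nextval() returns value + 1.
        self.last_value = value
        self.is_called = is_called
        return value


existing_ids = [1, 2, 3, 7]        # invented ids already in the table

seq = ToySequence()
seq.setval(max(existing_ids))      # like setval('my_model_id_seq', max(id))
next_id = seq.nextval()
print(next_id)

seq.setval(max(existing_ids), is_called=False)
same_id = seq.nextval()            # value itself is returned: a collision
print(same_id)
```

With the default form the next id is one past the maximum. Passing false as the third argument would hand out the maximum itself, which is the off-by-one case to avoid here.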

Licensed under: CC-BY-SA with attribution