Two processes saving a record in database at the same time

https://stackoverflow.com/questions/10447030

05-06-2021
|

Domanda

What happens when more than one user inserts data in Database (MySQL, Postgres) at exactly same time? How does it prioritize which record to be inserted first and which one later. If the answer is specific to application of program, I am asking in reference to web-applications.

Soluzione

It's never exactly the same time. One will happen before the other. Which one will, unless you implement your own prioritisation mechanism, is indeterminate, and you should never rely on it.

As to what will happen, well that depends.

For two inserts to the same table, if data integrity is dependant on what order they are executed in your database design has a horrendous flaw.

For collisions (two updates to the same record for instance). There are two implementations.

Pessimistic locking. Assume there will be a significant number of updates to teh same data, so issue a lock around it. If The lock exists fail the update (e.g. second one if first hasn't finished) with some suitable message.

Optimistic locking. Assume collisions will rarely happen. Usual way of doing this is to add a timestamp field to the record which changes every update. So when you read the data you get the timestamp, and when you write the data you only do it, if the timestamp you have matches the one that's there now, and update said timestamp as part of it. If it does not match you do the "Someone else has changed this data message".

There is a compromise position, where you try and merge two updates. (for instance you change name and I change address). You need to really think about that though, it's messy, and get very complicated very quickly, and getting it wrong run's a real risk of messing up the data.

People with far larger IQs than mine spend a lot of time on this stuff, personally I like to keep it like me, simple...

Altri suggerimenti

In general, two things never happen at exactly the same time. There's a queue of work and at some level one thing always happens before the other.

However, there are cases where an overall transaction may take multiple steps -- and if two of these kinds of transactions begin at nearly the same time, they may overlap in time. This can cause problems.

For example, imagine a person buys something in a shopping cart and the steps include both creating an order record for them and decrementing and inventory count. If two people begin this process at nearly the same time, they could both potentially buy the item before the inventory is decremented to show the item out of stock.

In cases where things like this can occur, postgres (and other modern databases) provide ways to restrict for programs to protect themselves. These include both transactions and locking.

With transactions (see postgres docs here), groups of statements are run as a single unit -- and if one of the later steps fails, all steps are 'rolled back'. (For example, if decrementing inventory isn't possible because the item is now out of stock, the order creation can be rolled back.)

With locking (see postgres docs here), tables (or even individual rows in a table) are locked so that any other process wanting to access them either waits or is timed out. This would prevent two processes from updating the same data at nearly the same time.

In general, the vast majority of applications don't require either of these approaches. Unless you're working in an environment such as at a bank where the tables involved contain financial transactions, you probably won't have to worry about it.

Autorizzato sotto: CC-BY-SA insieme a attribuzione

Non affiliato a StackOverflow