Question

I'm making a small online game where (what do you know) there'll be multiple users accessing the same database. My host doesn't enable PHP's semaphore functions and I can't really afford anything else (I'm a high school student), so I'm looking for an alternative.

After some research, I've come across a couple of workarounds:


A) I made a couple functions to mimic Semaphores, which is as good as I need I think:

function SemaphoreWait($id) {
    $filename = SEMAPHORE_PATH . $id . '.txt';
    // 'c' opens for writing without truncating, creating the file if needed
    $handle = fopen($filename, 'c') or die("Error opening file.");
    if (flock($handle, LOCK_EX)) {
        // flock() blocks here until the exclusive lock is granted
    } else {
        die("Could not lock file.");
    }
    return $handle;
}

function SemaphoreSignal($handle) {
    fclose($handle); // closing the handle releases the lock
}

(I know there is unnecessary code in the if-statement.) What do you guys think? Obviously it's not perfect, but is the general practice okay? Are there any problems? This is the first time I've worked with concurrency, and I usually prefer a low-level language where it makes a little more sense.

Anyway, I can't think of any logic flaws with this, and my only concern is speed; I've read that flock is quite slow.
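
For reference, I'd wrap each critical section like this (the id 'player42' is just a placeholder):

$handle = SemaphoreWait('player42');  // blocks until this player's lock is free
// ... read, modify and write this player's data ...
SemaphoreSignal($handle);             // releases the lock for the next request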


B) MySQL also provides locking facilities. How do these compare with flock? And I assume one script blocks if another user's script has locked the table it requests? The problem I can see is that this locks an entire table, but I only really need locks on individual users (so not everyone waits).
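
From what I've read, MySQL also has named advisory locks (GET_LOCK / RELEASE_LOCK) that lock on an arbitrary string rather than a whole table, so a per-user lock might look something like this (the name and timeout are just examples):

SELECT GET_LOCK('user_42', 10);   -- wait up to 10 seconds for this user's lock
-- read and update user 42's rows here
SELECT RELEASE_LOCK('user_42');   -- let the next request for user 42 proceed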


It seems like people are saying I don't need semaphores to update a database. I found out you can use a simple increment query, but there are still a few cases I need to work out:

What if I need to check a value before committing to an action? Let's say I need to check whether a defender is strong enough before allowing an attack, e.g. that they have enough health. And if so, do the battle and deal some damage without letting anyone else mess with the data in between.

My understanding is that this takes a couple of queries spread over some length of code (send the query, fetch the data, increment it, store it back), and the database wouldn't be smart enough to handle that on its own? Or am I mistaken here? Sorry.
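
Something like this sketch is what I mean (using mysqli; the table and column names are made up):

$result = $db->query("SELECT health FROM players WHERE login = 'defender'");
$row = $result->fetch_assoc();
if ($row['health'] >= 25) {  // check the value...
    // ...then act on it, but another request could run between these two queries
    $db->query("UPDATE players SET health = health - 25 WHERE login = 'defender'");
}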


Solution

The real problem here is not how to implement locks, it's how to enforce serializability. The way you're taught to achieve serializability in a pre- or non-database world is with locks and semaphores. The basic idea is this:

 lock()
 modify a bunch of shared memory
 unlock()

That way, when you have two concurrent users, you can be certain that neither of them will produce an invalid state. Your example scenario is one in which both players attack each other simultaneously and arrive at contradictory notions of who won, so you are concerned about this kind of interleaving:

  User A      User B
    |           |
    V           |
  attack!       |
    |           V
    |         attack!
    V           |
  read "wins"   |
    |           V
    |         read "wins"
    |           |
    V           |
  write "wins"  |
                V
              write "wins"

The problem is that interleaving the reads and writes like this causes user A's write to be overwritten (for instance, both read wins = 36 and both write back 37, losing one increment), or some other anomaly. These kinds of problems are generally called race conditions, because the two threads are effectively racing for the same resources: one of them will "win", the other will "lose", and the behavior is not what you want.

The solution with locks, semaphores or critical sections is to create a kind of bottleneck: only one task is in the critical section at a time, so this set of problems can't happen. Everybody trying to get through the bottleneck waits on whoever got there first—they're blocking:

 User A       User B
   |            |
   V            |
 attack!        |
   |          attack!
   V            |  
 lock           V
   |          blocking
   V            .
 read "wins"    .
   |            .
   V            .
 write "wins"   .
   |            .
   V            .
 unlock         V
              lock
                |
                V
               ...

Another way of looking at this is that the read/write combination needs to be treated as a single coherent unit that can't be interrupted; in other words, it needs to be treated atomically, as an atomic unit. This is exactly what the A in ACID stands for when people say a database is "ACID compliant." In the database we don't have locks (or at least, we should pretend not to); instead we use transactions, which delineate an atomic unit, like so:

BEGIN;

SELECT ...
UPDATE ...

COMMIT;
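
In PHP, the same shape might look like this sketch (assuming a mysqli connection in $db; the report mode makes failed queries throw, so the catch block can roll back):

mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

$db->begin_transaction();
try {
    $db->query("SELECT ...");  // placeholders for the real statements
    $db->query("UPDATE ...");
    $db->commit();             // both statements take effect together...
} catch (mysqli_sql_exception $e) {
    $db->rollback();           // ...or neither does
}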

Everything between BEGIN and COMMIT is expected to be treated as an atomic unit: either it all goes or none of it does. It turns out that relying on the A alone is not enough for your particular use case, because as far as the database can tell, there is no reason either of these transactions should make the other fail:

 User A     User B
   |          |
   V          |
 BEGIN        V
   |        BEGIN
   V          |
 SELECT ...   V
   |        SELECT ...
   V          |
 UPDATE       V
   |        UPDATE
   V          |
 COMMIT       V
            COMMIT

Especially if you write this properly, saying UPDATE players SET wins = wins + 1 instead of UPDATE players SET wins = 37, the database has no reason to suspect that these updates can't be performed in parallel, particularly if they're working on different rows. As a result, you're going to need to employ more database-fu: you need to worry about consistency, the C in ACID.

We want to design your schema so that the database itself can discern if something invalid has occurred, because if it can, the database will prevent it. Note: now that we're in design territory, there are necessarily going to be lots of different ways to approach the problem. The one I present here may not be the best one or even a good one, but I hope it will illustrate the thought processes that need to go on to solve problems with relational databases.

So now we're concerned with integrity, that is, with your database being in a valid state before and after every transaction. If the database is handling data validity like this, you can write your transactions the naive way and the database itself will abort them if they attempt to do something unreasonable due to concurrency. This means we have a new responsibility: we have to find a way to make the database aware of your semantics so it can handle the verification. In general, the simplest way to ensure validity is with primary and foreign key constraints—in other words, ensuring that rows are unique or that they definitely refer to rows in other tables. I'm going to show you the thought process and some alternatives for two scenarios in a game, in the hope that you'll be able to generalize from there.

The first scenario is kills. Suppose you don't want player 1 to be able to kill player 2 if player 2 is in the middle of killing player 1. This means you want kills to be atomic. I'd model this like so:

CREATE TABLE players (
  login VARCHAR PRIMARY KEY -- note: MySQL needs a length here, e.g. VARCHAR(32)
  -- password hashes, etc.
);

CREATE TABLE lives (
  login VARCHAR REFERENCES players,
  life INTEGER,
  PRIMARY KEY (login, life) -- needed so other tables can reference (login, life)
);

CREATE TABLE alive (
  login VARCHAR,
  life INTEGER,
  PRIMARY KEY (login, life),
  FOREIGN KEY (login, life) REFERENCES lives
);

CREATE TABLE deaths (
  login VARCHAR REFERENCES players,
  life INTEGER,
  killed_by VARCHAR,
  killed_by_life INTEGER,
  PRIMARY KEY (login, life),
  FOREIGN KEY (killed_by, killed_by_life) REFERENCES lives
);

Now you can atomically create new lives:

BEGIN;

-- find the next life number (COALESCE covers the very first life)
SELECT COALESCE(MAX(life), 0) + 1 FROM lives WHERE login = 'login';
INSERT INTO lives (login, life) VALUES ('login', 'new life #');
INSERT INTO alive (login, life) VALUES ('login', 'new life #');

COMMIT;

And you can atomically kill:

BEGIN;

SELECT login, life FROM alive
WHERE login = 'killer_name' AND life = 'life #';

SELECT login, life FROM alive
WHERE login = 'victim_name' AND life = 'life #';

-- if either of those SELECTs returned no row, the victim
-- or killer died in another transaction

INSERT INTO deaths (login, life, killed_by, killed_by_life)
VALUES ('victim', 'life #', 'killer', 'killer life #');
DELETE FROM alive WHERE login = 'victim' AND life = 'life #';

COMMIT;
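
An aside beyond the schema itself: if your engine supports row locking (InnoDB or PostgreSQL, for example), you can append FOR UPDATE to those SELECTs so the matched rows stay locked until the transaction ends, closing the window between the SELECTs and the INSERT/DELETE:

SELECT login, life FROM alive
WHERE login = 'victim_name' AND life = 'life #'
FOR UPDATE; -- this row stays locked until COMMIT or ROLLBACK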

Now you can be quite sure that these things are happening atomically, because the UNIQUE constraint implied by the PRIMARY KEY will prevent a second death record from being created for the same user and the same life, regardless of who the killer may be. You can also manually check that your constraints are being met, for example by issuing counting statements between steps and issuing a ROLLBACK if anything unexpected has happened. You can even bundle these things up into triggered check constraints, and further into stored procedures, to get really meticulous.
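
A manual check of that sort might look like this sketch (what counts as "unexpected" depends on your rules):

SELECT COUNT(*) FROM alive WHERE login = 'victim' AND life = 'life #';
-- if this returns 0, the victim already died in another transaction,
-- so issue a ROLLBACK instead of proceeding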

On to the second example: energy-limited maneuvers. The simplest thing to do is to add a check constraint:

CREATE TABLE player (
  login VARCHAR PRIMARY KEY, -- etc.
  energy INTEGER,
  CONSTRAINT ensure_energy_is_positive CHECK(energy >= 0)
);

Now if the player tries to use their energy twice, you'll get a constraint violation in one of the sequences:

Player A #1     Player A #2
    |               |
    V               |
  spell             |
    |               V
    V             spell
  BEGIN             |
    |               |
    |               V
    |             BEGIN
    V               |
  UPDATE SET energy = energy - 5;
    |               |
    |               |
    |               V
    |             UPDATE SET energy = energy - 5;
    V               |
  [implied CHECK: pass]
    |               |
    V               |
  COMMIT            V
                  [implied CHECK: fail!]
                    |
                    V
                  ROLLBACK

There are other things you can do as well to turn this into a relational-integrity problem rather than a check constraint, such as doling out energy in identified units and linking them to particular spell invocations, but that's probably too much work for what you're doing. This will work alright.
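
A related idiom, not used above but common where CHECK constraints aren't available, is to fold the precondition into the UPDATE itself and inspect the affected-row count:

UPDATE player SET energy = energy - 5
WHERE login = 'player_a' AND energy >= 5;
-- an affected-row count of 0 means there wasn't enough energy,
-- so the application treats the spell as failed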

Now, I hate to have to say this after laying all that out, but you will have to check your database and make sure everything is set up for real ACID compliance. By default, MySQL used to ship with MyISAM tables, which means you could BEGIN all day and every statement would still run individually. If you create your tables with InnoDB as the engine instead, it works more or less as expected. (Also note that MySQL versions before 8.0.16 parsed CHECK constraints but silently ignored them.) I recommend you try PostgreSQL if you can; it is a bit more consistent about things like this out of the box. And of course the commercial databases are also powerful. SQLite, on the other hand, has a write lock for the whole database, so if that's your backend the whole preceding discussion is fairly moot; I don't recommend it for concurrent-write scenarios.
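
On MySQL, checking and fixing the engine is quick (using the players table from above):

SHOW TABLE STATUS WHERE Name = 'players'; -- the Engine column should say InnoDB
ALTER TABLE players ENGINE = InnoDB;      -- converts an existing table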

In summary, databases are fundamentally trying to handle this problem for you in a very high-level way. 99% of the time, you simply don't need to worry about it: the odds of two requests colliding at exactly the wrong moment are just not that high, and the actions you care about happen so fast that the odds of producing an actual race condition are vanishingly small. But it's good to worry about it anyway. After all, you may be writing banking applications someday, and it's important to know how to do things right. Unfortunately, in this particular case, the lock primitives you're used to are very primitive compared to what relational databases are trying to do; databases are optimizing for speed and integrity, not simple, familiar reasoning.

If this has been an interesting detour for you, I hope you'll check out one of Joe Celko's books or a database textbook for more information.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow