Question

I'm working on a social network, like a subset of Facebook. I think this means that the application will be more read-heavy than write-heavy (i.e. more SELECTS than INSERTS, UPDATES, OR DELETES)

I'm planning to use MySQL for the database, using MyISAM. Each table in the database will contain the following three fields:

  • CREATED - a date field containing the time the record was created
  • UPDATED - a date field containing the time the record was modified
  • ROWSTATUS - a CHAR(1) field containing a single character flag to show if the record is Active, Inactive, or Deleted (using values 'A', I, and D, respectively).

Through a PHP wrapper class, we ensure that all SELECT queries include the ROWSTATUS, and UPDATE queries also update the UPDATED column and INSERT queries update the CREATED column.

I'm planning not to actually delete any records, opting instead to update that records ROWSTATUS field to D to show it is deleted (i.e. a soft delete).

We have a SQL procedure which physically deletes the deleted data after 10 days.

However, I was going through this article, which argues that there is no need to delete physically because of locking overheads. Rather, the author suggested using this scheme:

SELECT e.eventid,e.title
    FROM events e
   WHERE NOT EXISTS
    (SELECT * FROM event_deletes ed WHERE ed.eventid = e.eventid);

I'm wondering how my scheme compares with this proposed mechanism, and which one is better? I couldn't reach to any definitive answer on my own.

Was it helpful?

Solution

As @Pentium10 says, there isn't anything intrinsically wrong with your plan. It's actually a fairly standard approach.

The problem is just that if you're using MyISAM your UPDATEs will cause the entire table to lock while the query runs. That introduces a bottleneck because you can only update or delete a single record at a time.

Unless you have a reason for using MyISAM, I'd recommend switching to InnoDB as your database engine. InnoDB uses row-level locking, so your UPDATE queries won't block other UPDATEs. It also has some other nice features such as support for transactions and referential integrity constraints.

OTHER TIPS

The only problem I see here, compared to that article, that you are handling only locks for the DELETE call.

You should know that UPDATE and DELETE statements always needs to issue an exclusive lock on a MyISAM table.

That's why the article recommends using an INSERT instead of a UPDATE rowstatus. You should go as the article says. Create a dedicated table to store the deleted ids, and use the recommended join on the selects to retrieve non deleted records. This way on end-user's delete action, you will just insert into the table, and you won't cause UPDATE lock on the table. If you add the proper keys to both of the tables, the join will be done only on the indexes, si it will be fast on the SELECT.

You are also introducing overhead if you store the time of the update. You should drop that idea, as has no use, and you won't use to tell when a record was updated.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top