My question is theoretical. I have a database of e-mails. For each e-mail I store the desired sending time (as a UNIX timestamp) and its contents (sender, receiver, subject, body, etc.). A large number of e-mails are scheduled. This is how I have planned to send them so far: a worker process or server periodically queries the database for "overdue" e-mails based on the timestamps, sends them, and finally deletes them from the DB.

I started to think about two things:

  • What if the worker dies when it has sent the e-mail but hasn't deleted it from the database? If I restart the worker, the e-mail will be sent again.
  • How do I handle this if I have a really large number of e-mails and therefore run multiple workers? I can mark an e-mail in the database as "being sent", but how do I re-initiate sending if the responsible worker dies? I won't know whether a worker has died or is just so slow that it is still sending its messages. I'm assuming I cannot get notified when a worker dies, so I can't re-send the e-mails that it failed to send.

I know that sending e-mail is not as serious as bank transactions, but I think there must be a good solution for this.

How is this usually done?


Solution

I would actually use a flag on each email record in the database:

Each worker (there may be several) updates the oldest unclaimed record with its unique worker ID (e.g. a PID or an IP/PID combination).

Example for Oracle SQL:

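-- Claim the oldest due, unclaimed record for this worker.
update email set workerid = 'my-unique-worker-id'
where workerid is null  -- re-check so a concurrent claim is not overwritten
  and emailid in (
    select emailid from (
      -- sort first, then cut off with rownum in the outer query;
      -- rownum in the same query block is assigned before the order by
      select emailid from email
      where duetime < sysdate
        and workerid is null  -- "= null" never matches; null needs "is null"
      order by duetime
    )
    where rownum <= 1
  )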

This claims a single unprocessed record (the one with the oldest duetime, which has to be in the past) by setting its worker ID. The claim is synchronized by the database's normal locking mechanism, so only one worker writes to the record at a time.
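As a rough illustration of the claim step from the worker's side, here is a minimal Python sketch. It uses SQLite through the standard sqlite3 module as a stand-in for Oracle, which is an assumption made only so the example runs anywhere. The SCHEMA constant and the claim_one function are hypothetical names; only the column names emailid, duetime and workerid come from the statement above.

```python
import sqlite3

# Assumed minimal table layout; the real table also holds the e-mail contents.
SCHEMA = """
CREATE TABLE email (
    emailid  INTEGER PRIMARY KEY,
    duetime  INTEGER NOT NULL,   -- UNIX timestamp
    workerid TEXT                -- NULL means "not claimed yet"
)
"""


def claim_one(conn, worker_id, now):
    """Claim the oldest due, unclaimed e-mail.

    Returns the claimed emailid, or None if nothing is due or another
    worker claimed the candidate row first.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT emailid FROM email "
            "WHERE workerid IS NULL AND duetime < ? "
            "ORDER BY duetime LIMIT 1",
            (now,),
        ).fetchone()
        if row is None:
            return None  # no due mail
        # The "workerid IS NULL" guard makes the claim a compare-and-set:
        # if another worker got there first, zero rows are updated.
        cur = conn.execute(
            "UPDATE email SET workerid = ? "
            "WHERE emailid = ? AND workerid IS NULL",
            (worker_id, row[0]),
        )
        return row[0] if cur.rowcount == 1 else None
```

A worker would call claim_one in a loop and go back to sleep whenever it returns None. Checking the number of updated rows, rather than trusting the earlier SELECT, is what keeps two workers from both believing they own the same record.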

Then you select all records with:

select * from email where workerid = 'my-unique-worker-id'

which will be either 0 or 1 record. If it is 0, there is no due mail.

Once you have finished sending the email, set workerid = 'some-invalid-value' (or use a separate flag column to mark the progress). That way it doesn't get picked up by the next worker.
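The send-then-mark step might look like the following Python sketch, again with SQLite standing in for Oracle as an assumption. The DONE sentinel, the send_claimed function, the send callback and the receiver and body columns are hypothetical names chosen for illustration.

```python
import sqlite3

DONE = "done"  # hypothetical sentinel playing the role of 'some-invalid-value'


def send_claimed(conn, worker_id, send):
    """Send every record claimed by worker_id, then mark it as finished.

    `send` is any callable taking (receiver, body); a real worker would
    hand the message to an SMTP client here.
    Returns the number of records processed.
    """
    rows = conn.execute(
        "SELECT emailid, receiver, body FROM email WHERE workerid = ?",
        (worker_id,),
    ).fetchall()
    for emailid, receiver, body in rows:
        send(receiver, body)
        # A crash between send() and this update leaves the record claimed
        # but unmarked, so the mail may already be out.
        with conn:
            conn.execute(
                "UPDATE email SET workerid = ? "
                "WHERE emailid = ? AND workerid = ?",
                (DONE, emailid, worker_id),
            )
    return len(rows)
```

Marking the record only after the send succeeds means a crash can at worst leave a sent mail looking unsent, never an unsent mail looking sent.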

You probably won't be able to find out whether the email has really been sent. If the worker dies after sending and before updating the record, there's not much you can do. To be a bit more self-sufficient, the worker could create a process file locally (e.g. an empty file with the emailid as the file name). This could at least detect whether the crash was just a database connection issue.
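One possible reading of the process-file idea, sketched in Python: the worker creates the empty file right after the send succeeds and removes it once the database record has been updated. A file that survives a restart then points to a mail that went out but was never recorded. The function names and the marker directory are hypothetical, and the exact moment at which the file is written is an assumption.

```python
from pathlib import Path
from typing import List


def record_sent(marker_dir: Path, emailid: int) -> Path:
    """Create an empty marker file named after the emailid."""
    marker_dir.mkdir(parents=True, exist_ok=True)
    marker = marker_dir / str(emailid)
    marker.touch()
    return marker


def clear_sent(marker_dir: Path, emailid: int) -> None:
    """Remove the marker once the database record has been updated."""
    try:
        (marker_dir / str(emailid)).unlink()
    except FileNotFoundError:
        pass  # already cleared


def unrecorded_sends(marker_dir: Path) -> List[int]:
    """List emailids that were sent but never marked in the database."""
    if not marker_dir.is_dir():
        return []
    return sorted(int(p.name) for p in marker_dir.iterdir() if p.name.isdigit())
```

On startup the worker could call unrecorded_sends and update those records before claiming new ones. This only helps when the worker restarts on the same machine with its local files intact.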

If a worker, on startup and before claiming any record, already finds a message carrying its own ID as the workerid, I would raise an alert / error that should be handled manually (by checking the SMTP server log and manually updating the record).
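A startup check along these lines could be sketched in Python as follows, with SQLite standing in for Oracle as an assumption. LeftoverClaimError and check_no_leftovers are hypothetical names. The sketch also assumes the worker ID is stable across restarts, which a bare PID would not be.

```python
import sqlite3


class LeftoverClaimError(RuntimeError):
    """Raised when records are still claimed by this worker at startup."""


def check_no_leftovers(conn, worker_id):
    """Refuse to start if an earlier run left claimed records behind.

    Call this before the first claim. The listed emailids need manual
    review, for example against the SMTP server log.
    """
    rows = conn.execute(
        "SELECT emailid FROM email WHERE workerid = ? ORDER BY emailid",
        (worker_id,),
    ).fetchall()
    if rows:
        ids = [r[0] for r in rows]
        raise LeftoverClaimError(
            "worker %s still owns records %s; check the SMTP log" % (worker_id, ids)
        )
```

Raising an exception instead of silently re-sending keeps the decision with a human, which matches the manual handling described above.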

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow