Question

I have an agent job set to run log backups every two hours from 2:00 AM to 11:59 PM (leaving a window for running a full or differential backup). A similar job is set up in every one of my 50 or so instances, and I may be adding several hundred instances over time (we host SQL Servers for some of our customers). They all back up to the same SAN disk volume, which is causing latency issues and otherwise impacting performance.

I'd like to offset the job run times on each instance by 5 minutes, so that instance one runs the job at 2:00, 4:00, etc., instance two at 2:05, 4:05, etc., instance three at 2:10, 4:10, etc. If I offset the start time for the job on each instance (2:00 for instance one, 2:05 for instance two, 2:10 for instance three, and so on), can I reasonably expect to get my desired result: that the instances won't all run the job at the same time?


Solution

If this is the same conversation we just had on Twitter: when you tell SQL Server Agent to run every n minutes or every n hours, the next run is based on the start time, not the finish time. So if you set a job on instance 1 to run at 2:00 and repeat every 2 hours, the second run will start at 4:00, whether the first run took 1 minute, 12 minutes, or 45 minutes.

There are some caveats:

  • there can be minor delays due to internal agent synchronization, but I've never seen this off by more than a few seconds
  • if the first run at 2:00 takes more than 2 hours (but less than 4 hours), the next time the job runs will be at 6:00 (the 4:00 run is skipped, it doesn't run at 4:10 or 4:20 to "catch up")
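You can sanity-check what Agent has computed for the next run by querying msdb directly (a minimal sketch; note that next_run_date/next_run_time are cached values refreshed by Agent, so they can briefly lag behind reality):

    -- See when Agent has scheduled the next run of each enabled job.
    -- next_run_date is YYYYMMDD and next_run_time is HHMMSS, both integers.
    SELECT  j.name,
            s.next_run_date,
            s.next_run_time
    FROM    msdb.dbo.sysjobs AS j
    JOIN    msdb.dbo.sysjobschedules AS s
        ON  s.job_id = j.job_id
    WHERE   j.enabled = 1
    ORDER BY j.name;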

There was another suggestion to add a WAITFOR to offset the start time (and we should discard a random WAITFOR, because that is probably not what you want: random <> unique, so two instances could still land on the same time). If you want to hard-code a different delay on each instance (1 minute, 2 minutes, etc.), it is much more straightforward to do that with the schedule itself than by adding steps to all of your jobs, IMHO.
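For example, you can bake a per-instance offset into the schedule's start time. This is just a sketch: the job and schedule names are hypothetical, @OffsetMinutes is whatever number you assign to that instance (0, 5, 10, ...), and the HHMMSS arithmetic only works for offsets under 60 minutes:

    -- Sketch: a daily "every 2 hours" schedule whose start time is offset
    -- per instance. Job and schedule names here are made up.
    DECLARE @OffsetMinutes int = 5;   -- instance two: 2:05, 4:05, ...
    -- @active_start_time is an HHMMSS integer, so 2:05:00 AM = 20500.
    -- (This simple arithmetic assumes an offset under 60 minutes.)
    DECLARE @StartTime int = 20000 + (@OffsetMinutes * 100);

    EXEC msdb.dbo.sp_add_jobschedule
        @job_name             = N'Log Backups',           -- hypothetical job name
        @name                 = N'Every 2 hours, offset',
        @freq_type            = 4,          -- daily
        @freq_interval        = 1,          -- every 1 day
        @freq_subday_type     = 8,          -- unit: hours
        @freq_subday_interval = 2,          -- every 2 hours
        @active_start_time    = @StartTime,
        @active_end_time      = 235959;     -- stop recurring at 11:59:59 PM

Each subsequent run is then computed from @active_start_time, which is exactly the start-time-based behavior described above.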

OTHER TIPS

Perhaps you could set up a centralized database that manages the "schedule" and have each job add/update a row when it runs. Each server's job then polls that table and starts only when it is clear to do so. That way, any latency in one job causes the others to wait, so you don't get a disparity in your timings when one of the servers is thrown off.

Being a little paranoid, I'd add a catch-all rule that says after "x" minutes of waiting, proceed anyway, so that a delay doesn't cascade far enough that the jobs never run.
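A minimal sketch of that idea, assuming a central table named dbo.BackupSlots that every instance can reach (all object names here are hypothetical, and a production version should use something like sp_getapplock for real concurrency safety instead of this naive polling):

    -- Hypothetical coordination table, created once on the central server:
    -- CREATE TABLE dbo.BackupSlots
    -- (
    --     InstanceName sysname   NOT NULL PRIMARY KEY,
    --     IsRunning    bit       NOT NULL DEFAULT (0),
    --     LastStart    datetime2 NULL,
    --     LastFinish   datetime2 NULL
    -- );

    DECLARE @MaxWaitMinutes int = 30,   -- catch-all: proceed anyway after this long
            @Waited         int = 0;

    -- Poll until no other instance is mid-backup, or the timeout expires.
    WHILE @Waited < @MaxWaitMinutes
      AND EXISTS (SELECT 1 FROM dbo.BackupSlots
                  WHERE IsRunning = 1 AND InstanceName <> @@SERVERNAME)
    BEGIN
        WAITFOR DELAY '00:01:00';       -- re-check once a minute
        SET @Waited += 1;
    END;

    -- Claim our slot, run the backup, then release the slot.
    UPDATE dbo.BackupSlots
       SET IsRunning = 1, LastStart = SYSDATETIME()
     WHERE InstanceName = @@SERVERNAME;

    -- ... BACKUP LOG commands run here ...

    UPDATE dbo.BackupSlots
       SET IsRunning = 0, LastFinish = SYSDATETIME()
     WHERE InstanceName = @@SERVERNAME;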
