flock correct usage to prevent read while writing
Question
*/10 * * * * /usr/bin/flock -x -w 10 /tmp/craigslist.lock /usr/bin/lynx -width=120 -dump "http://sfbay.craigslist.org/search/roo/sfc?query=&srchType=A&minAsk=&maxAsk=1100&nh=6&nh=8&nh=16&nh=24&nh=17&nh=21&nh=22&nh=23&nh=27" | grep "sort by most recent" -A 53 > /home/winchell/apartments.txt
*/10 * * * * /usr/bin/flock -x -w 10 /tmp/craigslist.lock /usr/bin/php /home/winchell/apartments.php
This is a cron job. The php command on the second line seems to execute even while lynx is writing to apartments.txt, and I don't see why. Is this correct usage, assuming I'm trying to prevent reads from apartments.txt while lynx/grep are writing to it? Thanks!
Solution
Your usage is not correct. Notice how your first cron job is a pipeline consisting of two commands:
/usr/bin/flock -x -w 10 /tmp/craigslist.lock /usr/bin/lynx -width=120 -dump "http://sfbay.craigslist.org/search/roo/sfc?query=&srchType=A&minAsk=&maxAsk=1100&nh=6&nh=8&nh=16&nh=24&nh=17&nh=21&nh=22&nh=23&nh=27"
which is then piped to:
grep "sort by most recent" -A 53 > /home/winchell/apartments.txt
So the first command is locking a file but it's the second command that's writing to that file! The second command will happily execute without waiting for the lock.
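A small sketch of this effect, which runs without network access. The temporary lock and output files, and the echo/cat stand-ins for lynx and grep, are assumptions made up for the demo. A background job holds the lock, and the pipeline has the same shape as the cron line:

```sh
#!/bin/sh
# Hypothetical temp files standing in for /tmp/craigslist.lock and apartments.txt.
lock=$(mktemp)
out=$(mktemp)
echo "old contents" > "$out"

# Hold the lock in the background (stand-in for the php reader).
flock -x "$lock" sleep 3 &
holder=$!
sleep 1   # give the holder time to acquire the lock

# Same shape as the cron line: flock guards only the left-hand command.
# -n makes flock give up at once instead of waiting, so echo never runs.
flock -x -n "$lock" echo "new contents" | cat > "$out"

# The right-hand side ran anyway: the shell opened (and truncated) the
# output file without ever consulting the lock.
if [ -s "$out" ]; then result="intact"; else result="truncated"; fi
echo "$result"

wait "$holder"
rm -f "$lock" "$out"
```

The old contents are gone even though flock never acquired the lock, because the redirection belongs to the second command of the pipeline.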
One way to fix this would be to write the file while holding the lock:
lynx etc... | grep etc... | flock -x -w 10 /tmp/craigslist.lock tee /home/winchell/apartments.txt
The disadvantage of this approach is that lynx and grep run even if the file is locked. To prevent this, you will have to run the whole thing under the lock:
flock -x -w 10 /tmp/craigslist.lock sh -c "lynx etc... | grep etc... >thefile"
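With this approach you will have to pay careful attention to quoting: the whole pipeline is passed to sh -c as one quoted string, and the URL argument of lynx needs its own quotes inside that string because it contains & characters.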
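One way to handle the nested quoting, sketched under some assumptions: single quotes around the sh -c string and double quotes around the URL inside it. So that it runs without network access, printf stands in for lynx, the URL is a made-up example.invalid address, and the lock and output files are temporary files rather than your real paths:

```sh
#!/bin/sh
# Hypothetical temp files standing in for /tmp/craigslist.lock and apartments.txt.
lock=$(mktemp)
out=$(mktemp)

# The whole pipeline, redirection included, runs inside the sh started by flock.
# Outer single quotes protect the command string from the calling shell;
# inner double quotes keep the & characters in the URL from being treated as
# "run in background". The output path is passed as $1 to avoid a third
# quoting level. In a crontab this must all be on one line, as it is here.
flock -x -w 10 "$lock" sh -c 'printf "%s\n" "http://example.invalid/search?query=&maxAsk=1100&nh=6" | grep "maxAsk" > "$1"' sh "$out"

result=$(cat "$out")
echo "$result"
rm -f "$lock" "$out"
```

In the real cron line you would put the lynx command and your actual URL where the printf is, and keep the same /tmp/craigslist.lock path that the php job uses so both jobs contend for the same lock.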
Finally: consider using curl or wget instead of lynx. lynx is meant for interactive usage!