Are all race conditions worth fixing? [closed]

https://softwareengineering.stackexchange.com/questions/231199

02-10-2020

Question

I just wrote the following piece of code (in Delphi):

procedure Update(Value: Integer);
begin
  // If the last update was yesterday, replace yesterday value
  if CompareDate(FLastUpdate,Now) <> 0 then
  begin
    FYesterdayValue := FTodayValue;
  end;
  FTodayValue := Value;
  FLastUpdate := Now;
end;

Then I realized that it contains a small but real race condition. Imagine that the first call to "Now" happens one millisecond before midnight: the if-statement is skipped, and the timestamp is then set by a second call to "Now", which happens one millisecond AFTER midnight. The timestamp now carries the new date even though the day change was never handled, so FYesterdayValue will not get the correct value.

In practice, I think the chance of this ever happening is almost nonexistent because (a) the code is typically called about once per minute and (b) the CPU time between the two calls is EXTREMELY small.

However, it's a bug, and I'm curious whether you would fix it if you stumbled upon it in a project.

Solution

As a general rule, problems are worth fixing when the expected benefit exceeds the expected cost. Thread safety is no exception to this rule; it is just that estimating the risk, which determines the expected cost, is particularly complex and poorly understood by many.

To begin with, thread safety is not on many people's radar. They simply assume "This is never called concurrently, so nothing can go wrong". And often it isn't. Other times it is (e.g. the CPU stalled for an unexpectedly long time because the entire app now lives in an oversubscribed virtual machine), and then nobody has any clue what could possibly have happened. In other words, the probability of a race actually occurring is often very hard to estimate.

The impact is often easier to understand: if you're logging some informative stuff to an internal audit log, it tends to be low; if the date that you're maintaining is related to business logic (perhaps it determines when the subscription licence of your customer expires?), then it might be catastrophically bad.

Without knowing the context of this code, I can't say what should be done. But it is almost certainly not a good idea to notice a possible race condition and then do nothing about it by convincing yourself that it won't be a problem. First, as I said, it is particularly hard to predict whether it will be a problem or not. Second, if the condition were probable, you would have to deal with it anyway, so you need to be capable of securing your code. And once you understand how to do that (usually it isn't that hard), there is comparatively little reason not to do it every time as a matter of principle, if only to avoid telling yourself "Ah, I don't need to worry in this instance, this will never happen", which is a classic example of "famous last words".

Other tips

I would fix this bug, not just because of the race condition, but also because the version where you call Now once at the start and save it is much easier to reason about in general.
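A minimal sketch of that version, assuming the method lives in a class (called TValueTracker here, a hypothetical name; the question shows only the method body) and that the fields are a TDateTime and two Integers. The only change from the original is that "Now" is read once into a local variable, named CurrentTime for illustration, and reused for both the date check and the stored timestamp:

```pascal
program UpdateWithSingleClockRead;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, DateUtils;

type
  // Hypothetical host class: the question shows only the method body.
  TValueTracker = class
  private
    FLastUpdate: TDateTime;
    FTodayValue: Integer;
    FYesterdayValue: Integer;
  public
    procedure Update(Value: Integer);
  end;

procedure TValueTracker.Update(Value: Integer);
var
  CurrentTime: TDateTime;
begin
  // Read the clock exactly once, so the date check and the stored
  // timestamp always refer to the same instant.
  CurrentTime := Now;
  // A different calendar date means the stored "today" value is stale.
  if CompareDate(FLastUpdate, CurrentTime) <> 0 then
    FYesterdayValue := FTodayValue;
  FTodayValue := Value;
  FLastUpdate := CurrentTime;
end;

var
  Tracker: TValueTracker;
begin
  Tracker := TValueTracker.Create;
  try
    Tracker.Update(42);
  finally
    Tracker.Free;
  end;
end.
```

With a single clock read, the midnight window described in the question disappears: whatever date the check sees is the date that gets stored. This does not by itself make the method safe to call from several threads at once; that would need separate synchronization.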

Basically, the more predictable a function is, the easier it is to understand. Now is, in a way, very unpredictable (it might return different values every time you call it), and thus every time you call it you make your own function less predictable and harder to understand.

As such, I really like Rory's suggestion that you refactor the call to Now out of the function completely. That makes the function even more predictable and, as he points out, unit-testable.
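The answer being referred to is not reproduced here, so the following is only one plausible reading of that refactoring: the caller passes the timestamp in, and the method never touches the clock. The class name TValueTracker, the ATimestamp parameter and the two read-only properties are assumptions added for illustration.

```pascal
program UpdateWithInjectedTime;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, DateUtils;

type
  // Hypothetical host class: the question shows only the method body.
  TValueTracker = class
  private
    FLastUpdate: TDateTime;
    FTodayValue: Integer;
    FYesterdayValue: Integer;
  public
    // ATimestamp is an assumed parameter: the caller supplies the time.
    procedure Update(Value: Integer; ATimestamp: TDateTime);
    property TodayValue: Integer read FTodayValue;
    property YesterdayValue: Integer read FYesterdayValue;
  end;

procedure TValueTracker.Update(Value: Integer; ATimestamp: TDateTime);
begin
  // The same timestamp is used for the check and for the stored value.
  if CompareDate(FLastUpdate, ATimestamp) <> 0 then
    FYesterdayValue := FTodayValue;
  FTodayValue := Value;
  FLastUpdate := ATimestamp;
end;

var
  Tracker: TValueTracker;
begin
  Tracker := TValueTracker.Create;
  try
    // Two updates that straddle midnight, using fixed timestamps
    // instead of the real clock.
    Tracker.Update(10, EncodeDateTime(2020, 1, 1, 23, 59, 59, 999));
    Tracker.Update(20, EncodeDateTime(2020, 1, 2, 0, 0, 0, 1));
    WriteLn('Yesterday: ', Tracker.YesterdayValue); // prints "Yesterday: 10"
    WriteLn('Today: ', Tracker.TodayValue);         // prints "Today: 20"
  finally
    Tracker.Free;
  end;
end.
```

Production code would call Tracker.Update(Value, Now), while a test can feed in any pair of timestamps, including ones on either side of midnight, without waiting for the clock.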

License: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange