I would think it is possible, using an instrumentation API like Intel's PIN (User Guide). For instance, you can start by instrumenting all instructions which perform atomic updates with INS_IsAtomicUpdate and add further exclusion criteria to heuristically locate the instructions generated by __sync_fetch_and_add.
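As a rough sketch of what such exclusion criteria might look like, here is a small self-contained C++ model of the filter. The InsInfo struct and both function names are hypothetical and are not part of Pin; in a real tool the atomic_update flag would come from INS_IsAtomicUpdate and the mnemonic from Pin's instruction inspection calls. It also assumes that, on x86, the compiler emits a lock-prefixed xadd (or a lock-prefixed add when the result is unused) for __sync_fetch_and_add, which you should confirm by disassembling your own binary.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified view of one decoded instruction.
// Not a Pin type: a real tool would fill these fields from Pin queries.
struct InsInfo {
    std::string mnemonic;  // lower-case mnemonic, without the lock prefix
    bool atomic_update;    // what INS_IsAtomicUpdate would report
};

// Heuristic (assumption): keep only atomic updates whose mnemonic matches
// what a compiler typically emits for __sync_fetch_and_add on x86.
// Other atomic updates such as xchg or cmpxchg are excluded.
bool looks_like_fetch_and_add(const InsInfo& ins) {
    if (!ins.atomic_update) {
        return false;  // plain, non-atomic instruction
    }
    return ins.mnemonic == "xadd" || ins.mnemonic == "add";
}

// Apply the heuristic to a sequence of instructions, preserving order.
std::vector<InsInfo> filter_candidates(const std::vector<InsInfo>& trace) {
    std::vector<InsInfo> hits;
    for (const InsInfo& ins : trace) {
        if (looks_like_fetch_and_add(ins)) {
            hits.push_back(ins);
        }
    }
    return hits;
}
```

The filter will still produce false positives (for example, a hand-written lock add elsewhere in the program), which is why this approach stays a heuristic.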
Alternatively, you can insert a series of NOPs with an asm volatile block before each __sync_fetch_and_add, look specifically for that instruction sequence, and instrument the machine code that follows (which is bound to be the code generated for __sync_fetch_and_add).
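A minimal sketch of the marker idea, assuming GCC or Clang. The wrapper name marked_fetch_and_add and the choice of four NOPs are mine, purely for illustration; pick a sequence that is unlikely to occur elsewhere in your binary.

```cpp
// Emits a fixed run of NOPs and then performs the atomic add.
// always_inline keeps the marker at each call site instead of inside one
// shared function body. The "memory" clobber discourages the compiler from
// moving memory accesses across the marker.
template <typename T>
static inline __attribute__((always_inline))
T marked_fetch_and_add(T* ptr, T value) {
    __asm__ __volatile__("nop\n\tnop\n\tnop\n\tnop" ::: "memory");
    return __sync_fetch_and_add(ptr, value);
}
```

You would then call marked_fetch_and_add(&counter, 1L) wherever the code currently calls __sync_fetch_and_add. Note that the compiler may still place a few setup instructions (such as loading the operand into a register) between the NOPs and the atomic instruction, so the instrumentation tool should scan a short window after the marker rather than only the very next instruction.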