Let me first explain how the spinlock code works. We have the variables

uint16_t inc = 0x0100;
lock->slock; // I'll just call this "slock"

In the assembler code, inc is referred to as %0 and slock as %1. Moreover, %b0 denotes the lower 8 bits of inc, i.e. inc % 0x100, and %h0 the upper 8 bits, i.e. inc / 0x100.
Now:
lock xaddw %w0, %1 ;; "inc := slock" and "slock := inc + slock"
;; simultaneously (atomic exchange and increment)
1:
cmpb %h0, %b0 ;; "if (inc / 256 == inc % 256)"
je 2f ;; " goto 2;"
rep ; nop ;; "yield();"
movb %1, %b0 ;; "inc = slock;"
jmp 1b ;; "goto 1;"
2:
Comparing the upper and lower byte of inc succeeds if the two bytes are equal. Since inc holds the original value of the lock, this happens if the lock was unlocked (the owner byte matched the next-ticket byte). In that case, the next-ticket byte of slock has already been incremented by the atomic exchange-and-add, so the two bytes of slock now differ and the lock reads as taken.
Otherwise, i.e. if the lock was already held, we pause a little, reload the lower byte of inc from the lock's current owner byte, and try again until the owner catches up with our ticket.
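The logic just described can be sketched in plain C (a rough equivalent using GCC's __atomic builtins; spinlock_t and spin_lock are hypothetical names, not the kernel's actual code):

```c
#include <stdint.h>

typedef struct { uint16_t slock; } spinlock_t;   /* hypothetical */

static void spin_lock(spinlock_t *lock)
{
    /* "lock xaddw": atomically fetch the old value and add 0x0100 */
    uint16_t inc = __atomic_fetch_add(&lock->slock, 0x0100, __ATOMIC_ACQUIRE);

    uint8_t ticket = inc / 0x100;   /* %h0: the ticket we drew     */
    uint8_t owner  = inc % 0x100;   /* %b0: who holds the lock now */

    while (ticket != owner) {       /* cmpb %h0, %b0 / je 2f       */
        /* "rep; nop" (pause) would go here */
        owner = __atomic_load_n(&lock->slock, __ATOMIC_RELAXED)
                % 0x100;            /* movb %1, %b0                */
    }
}
```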
(I believe there's actually a possibility of an overflow if 2^8 = 256 threads simultaneously attempt to get the spinlock. In that case, slock is updated to 0x0100, 0x0200, ..., 0xFF00, 0x0000, and would then appear to be unlocked. Maybe that's why the second version of the code uses a 16-bit wide ticket counter, which would require 2^16 simultaneous attempts.)
Now let's insert a counter:
uint32_t spincounter = 0;
asm volatile( /* code below */
    : "+Q" (inc), "+m" (lock->slock), "+r" (spincounter)
    : /* no inputs */
    : "memory", "cc");
Note that spincounter has to be a read-write output operand ("+r"), not an input, since we modify it; "=r" in the input list would be invalid. Now spincounter may be referred to as %2. We just need to increment the counter on each pass through the loop:
1:
inc %2
cmpb %h0, %b0
;; etc etc
I haven't tested this, but that's the general idea.