The mutex is probably faster in this case because the same task takes it over and over with no other task involved. My guess is that the mutex code takes a shortcut intended to support recursive takes (i.e. the same task taking the same mutex twice). Your code is not technically a recursive take, but it probably hits the same shortcut because the semaphore owner was never overwritten by another task taking the semaphore.
In other words you do:
1) semTake(semMutex)
2) ++global;
3) semGive(semMutex) // sem owner flag is not changed
4) semTake(semMutex) // from same task as previous semTake
...
Then in step 4 the semTake sees that sem owner == current task id (because the sem owner was set in step 1 and never changed to anything else), so it just marks the semaphore as taken and quickly jumps out.
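To make the guess concrete, here is a minimal toy model of the shortcut in plain C. It is not the VxWorks source and does not use any VxWorks API: toy_mutex, toy_take, toy_give, and the fast/slow counters are all hypothetical names invented for illustration, and the only assumption carried over from above is that the give leaves the owner field untouched.

```c
#include <assert.h>
#include <stdbool.h>

typedef int task_id;               /* hypothetical stand-in for a task ID */
#define NO_TASK (-1)

typedef struct {
    bool     taken;
    task_id  last_owner;           /* assumption: NOT cleared on give */
    unsigned fast_takes;           /* takes that hit the shortcut */
    unsigned slow_takes;           /* takes that had to record a new owner */
} toy_mutex;

void toy_init(toy_mutex *m)
{
    m->taken = false;
    m->last_owner = NO_TASK;
    m->fast_takes = 0;
    m->slow_takes = 0;
}

/* Returns true if the mutex was acquired. A real kernel would block or
 * bump a recursion count when the mutex is already taken; the toy just
 * reports failure. */
bool toy_take(toy_mutex *m, task_id self)
{
    if (m->taken)
        return false;
    if (m->last_owner == self) {
        m->fast_takes++;           /* owner already matches: mark taken and leave */
    } else {
        m->last_owner = self;      /* ownership changes hands: extra bookkeeping */
        m->slow_takes++;
    }
    m->taken = true;
    return true;
}

void toy_give(toy_mutex *m, task_id self)
{
    assert(m->taken && m->last_owner == self);
    m->taken = false;              /* owner field deliberately left as-is */
}
```

In this model, a single task looping take / ++global / give goes through the slow branch once and the fast branch on every later iteration, which is the behaviour I am guessing at for the real mutex.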
Of course this is a guess. A quick look at the source code and some VxWorks shell breakpoints could confirm it, but I can't do that myself because I no longer have access to VxWorks.
Also see the semMLib docs, which describe the recursive use of mutex semaphores.