Question

I have seen some people hate on recursive_mutex:

http://www.zaval.org/resources/library/butenhof1.html

But when thinking about how to implement a thread-safe (mutex-protected) class, it seems to me excruciatingly hard to prove that every method that should be mutex protected actually is, and that the mutex is locked at most once.

So for object-oriented design, should std::recursive_mutex be the default, with std::mutex treated as a performance optimization for the general case, unless it is used in only one place (to protect only one resource)?

To make things clear, I'm talking about one private non-static mutex. So each class instance has only one mutex.

At the beginning of each public method:

{
    std::scoped_lock<std::recursive_mutex> sl(m); // m is the instance's private mutex
    // ...
}

Solution

Most of the time, if you think you need a recursive mutex then your design is wrong, so it definitely should not be the default.

For a class with a single mutex protecting the data members, the mutex should be locked in all the public member functions, and all the private member functions should assume the mutex is already locked.

If a public member function needs to call another public member function, then split the second one in two: a private implementation function that does the work, and a public member function that just locks the mutex and calls the private one. The first member function can then also call the implementation function without having to worry about recursive locking.

e.g.

#include <mutex>
#include <stdexcept>

class X {
    std::mutex m;
    int data;
    int const max = 50;

    // Private helper: assumes m is already locked by the caller.
    void increment_data() {
        if (data >= max)
            throw std::runtime_error("too big");
        ++data;
    }
public:
    X() : data(0) {}
    int fetch_count() {
        std::lock_guard<std::mutex> guard(m);
        return data;
    }
    void increase_count() {
        std::lock_guard<std::mutex> guard(m);
        increment_data();
    }
    int increase_count_and_return() {
        std::lock_guard<std::mutex> guard(m);
        increment_data();
        return data;
    }
};

This is of course a trivial contrived example, but the increment_data function is shared between two public member functions, each of which locks the mutex. In single-threaded code, it could be inlined into increase_count, and increase_count_and_return could call that, but we can't do that in multithreaded code.

This is just an application of good design principles: the public member functions take responsibility for locking the mutex, and delegate the responsibility for doing the work to the private member function.

This has the benefit that the public member functions only have to deal with being called when the class is in a consistent state: the mutex is unlocked, and once it is locked then all invariants hold. If you call public member functions from each other then they have to handle the case that the mutex is already locked, and that the invariants don't necessarily hold.

It also means that things like condition variable waits will work: if you pass a lock on a recursive mutex to a condition variable then (a) you need to use std::condition_variable_any because std::condition_variable won't work, and (b) only one level of lock is released, so you may still hold the lock, and thus deadlock because the thread that would trigger the predicate and do the notify cannot acquire the lock.
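
To make the deadlock concrete, here is a hypothetical sketch (the class and member names are invented for illustration): one public member function locks the recursive mutex and then calls another public member function that waits on a condition variable.

#include <condition_variable>
#include <mutex>

class Mailbox {
    std::recursive_mutex m;
    std::condition_variable_any cv; // std::condition_variable only works with std::mutex
    bool has_message = false;
public:
    void wait_for_message() {
        std::unique_lock<std::recursive_mutex> lock(m);
        cv.wait(lock, [this] { return has_message; }); // releases only ONE level of m
    }
    void drain() {
        std::lock_guard<std::recursive_mutex> outer(m); // first level of the lock
        wait_for_message();                             // second level: the wait leaves m locked
    }
    void deliver() {
        std::lock_guard<std::recursive_mutex> lock(m);  // never acquired while drain() is waiting
        has_message = true;
        cv.notify_one();
    }
};

A thread calling drain() blocks forever: the wait releases only the inner level of the lock, so a thread calling deliver() can never acquire m to set the predicate and notify.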

I struggle to think of a scenario where a recursive mutex is required.

Other tips

should std::recursive_mutex be the default and std::mutex considered a performance optimization?

Not really, no. The advantage of using non-recursive locks is not just a performance optimization; it means that your code is self-checking that leaf-level atomic operations really are leaf-level, i.e. that they aren't calling something else that uses the lock.

There's a reasonably common situation where you have:

  • a function that implements some operation that needs to be serialized, so it takes the mutex and does it.
  • another function that implements a larger serialized operation, and wants to call the first function to do one step of it, while it is holding the lock for the larger operation.

For the sake of a concrete example, perhaps the first function (call it remove_one_node) atomically removes a node from a list, while the second (remove_two_nodes) atomically removes two nodes from the list (and you never want another thread to see the list with only one of the two nodes taken out).

You don't need recursive mutexes for this. For example, you could refactor the first function into a public function that takes the lock and calls a private function that does the operation "unsafely". The second function can then call the same private function.
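
A minimal sketch of that refactoring, assuming the list is simply a std::list<int>; the names NodeList and remove_front_unlocked are made up for illustration:

#include <list>
#include <mutex>

class NodeList {
    std::mutex m;
    std::list<int> nodes;

    // "Unsafe" helper: assumes m is already held by the caller.
    void remove_front_unlocked() {
        if (!nodes.empty())
            nodes.pop_front();
    }
public:
    void add(int value) {
        std::lock_guard<std::mutex> lock(m);
        nodes.push_back(value);
    }
    void remove_one_node() {
        std::lock_guard<std::mutex> lock(m);
        remove_front_unlocked();
    }
    // No other thread can observe the list with only one of the two nodes removed.
    void remove_two_nodes() {
        std::lock_guard<std::mutex> lock(m);
        remove_front_unlocked();
        remove_front_unlocked();
    }
};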

However, sometimes it's convenient to use a recursive mutex instead. There's still an issue with this design: remove_two_nodes calls remove_one_node at a point where a class invariant doesn't hold (the second time it calls it, the list is in precisely the state we don't want to expose). But assuming we know that remove_one_node doesn't rely on that invariant, this isn't a killer fault in the design; it's just that we've made our rules a little more complex than the ideal "all class invariants always hold whenever any public function is entered".
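
For comparison, a hedged sketch of the recursive-mutex variant of the same hypothetical class:

#include <list>
#include <mutex>

class NodeListRecursive {
    std::recursive_mutex m;
    std::list<int> nodes;
public:
    void remove_one_node() {
        std::lock_guard<std::recursive_mutex> lock(m);
        if (!nodes.empty())
            nodes.pop_front();
    }
    // Relies on remove_one_node not needing the "no half-removed state is
    // visible" invariant while the outer lock is already held.
    void remove_two_nodes() {
        std::lock_guard<std::recursive_mutex> lock(m);
        remove_one_node(); // re-locks m recursively
        remove_one_node();
    }
};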

So, the trick is occasionally useful and I don't hate recursive mutexes to quite the extent that article does. I don't have the historical knowledge to argue that the reason for their inclusion in POSIX is different from what the article says, "to demonstrate mutex attributes and thread extensions". I certainly don't consider them the default, though.

I think it's safe to say that if in your design you're uncertain whether you need a recursive lock or not, then your design is incomplete. You will later regret the fact that you're writing code and you don't know something so fundamentally important as whether the lock is allowed to be already held or not. So don't put in a recursive lock "just in case".

If you know that you need one, use one. If you know that you don't need one, then using a non-recursive lock isn't just an optimization, it's helping to enforce a constraint of the design. It's more useful for the second lock to fail than for it to succeed and conceal the fact that you've accidentally done something that your design says should never happen. But if you follow your design and never double-lock the mutex, then you'll never find out whether it's recursive or not, and so a recursive mutex isn't directly harmful.

This analogy might fail, but here's another way to look at it. Imagine you had a choice between two kinds of pointer: one that aborts the program with a stacktrace when you dereference a null pointer, and another that returns 0 (or, to extend it to more types, behaves as if the pointer refers to a value-initialized object). A non-recursive mutex is a bit like the one that aborts, and a recursive mutex is a bit like the one that returns 0. They both potentially have their uses; people sometimes go to some lengths to implement a "quiet not-a-value" value. But when your code is designed never to dereference a null pointer, you don't want the version that silently allows that to happen to be the default.

I'm not going to weigh in directly on the mutex versus recursive_mutex debate, but I thought it would be good to share a scenario where recursive_mutexes are absolutely critical to the design.

When working with Boost.Asio and Boost.Coroutine (and probably things like NT Fibers, although I'm less familiar with them), it is absolutely essential that your mutexes be recursive, even without the design problem of re-entrancy.

The reason is that the coroutine-based approach, by its very design, will suspend execution inside a routine and then subsequently resume it. This means that two top-level methods of a class might effectively be running "at the same time on the same thread" without either one calling the other.

Licensed under: CC-BY-SA with attribution