We're showing approximately twice the number of SENDs/sec than we are seeing RECEIVEs/sec under our heaviest load.
I think this is the crux of the problem. The counter measures the rate of statement executions, not the number of messages. This means that each of your RECEIVE statements is probably returning only one or two messages per result set. Because of conversation group locking, a RECEIVE can return messages from only one conversation group in each result set. Even if there are thousands of messages available in the queue, if they are all on separate conversations, each RECEIVE will return only one of them. This usually results in poor performance, with symptoms just like the ones you describe.
To achieve high throughput you'll have to somehow get the messages to belong to only a few conversations, so that each RECEIVE can return a significant result set on the queues that have the problem. How to achieve this depends on the specifics of your business workflow.
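
To make the effect concrete, here is a rough toy model in Python. It is not Service Broker code and it does not call any real API: `drain_queue` is a hypothetical helper that only imitates the rule described above, namely that one receive call returns messages from a single conversation group. It also assumes all messages are already in the queue and that each call takes everything available for the group it picks.

```python
from collections import OrderedDict, deque


def drain_queue(messages):
    """Toy model: each simulated receive call returns messages from
    exactly one conversation group (hypothetical helper, not a real API).

    messages: iterable of (conversation_group_id, body) in arrival order.
    Returns (receive_calls, messages_received).
    """
    # Bucket the pending messages by conversation group, in arrival order.
    pending = OrderedDict()
    for group_id, body in messages:
        pending.setdefault(group_id, deque()).append(body)

    receive_calls = 0
    messages_received = 0
    while pending:
        # One simulated receive: take the oldest group and all of its messages.
        _group_id, batch = pending.popitem(last=False)
        receive_calls += 1
        messages_received += len(batch)
    return receive_calls, messages_received


# 1000 messages, each on its own conversation (one message per group).
scattered = [(i, f"msg-{i}") for i in range(1000)]

# The same 1000 messages spread over only 4 conversations.
consolidated = [(i % 4, f"msg-{i}") for i in range(1000)]

scattered_stats = drain_queue(scattered)
consolidated_stats = drain_queue(consolidated)

print(scattered_stats)     # (1000, 1000): one receive call per message
print(consolidated_stats)  # (4, 1000): four receive calls for all messages
```

In this model the number of receive calls tracks the number of conversation groups rather than the number of messages, which is why a per-statement counter can look so different from the actual message rate.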