"What is the function being differentiated? What is the "special case?""
The most important distinction between backpropagation and reverse-mode AD is that reverse-mode AD computes vector-Jacobian products of a vector-valued function from R^n -> R^m, while backpropagation computes the gradient of a scalar-valued function from R^n -> R. Backpropagation is therefore the special case of reverse-mode AD where m = 1: the gradient is just the vector-Jacobian product with the scalar cotangent 1.
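A minimal JAX sketch of the general case (the function `f` and the shapes are made up for illustration): `jax.vjp` pulls back an arbitrary cotangent vector through a vector-valued function, which is the core reverse-mode operation.

```python
import jax
import jax.numpy as jnp

# A vector-valued function f: R^3 -> R^2.
def f(x):
    return jnp.array([jnp.sum(x ** 2), jnp.prod(x)])

x = jnp.array([1.0, 2.0, 3.0])

# Reverse-mode AD: the vector-Jacobian product v^T J for any cotangent v in R^2.
y, vjp_fn = jax.vjp(f, x)
v = jnp.array([1.0, 0.0])   # picks out the row of J for the first output
(vjp,) = vjp_fn(v)
print(vjp)                  # gradient of the first output, 2x -> [2. 4. 6.]
```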
When we train neural networks, the loss is always scalar-valued, so the function being differentiated is exactly of this form: we are always using backpropagation. And since backprop is a special case of reverse-mode AD, we are also using reverse-mode AD whenever we train a neural network.
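Continuing the sketch above, for a scalar-valued loss the vector-Jacobian product with cotangent 1 is exactly what `jax.grad` computes (the loss here is a toy example):

```python
import jax
import jax.numpy as jnp

# A scalar-valued loss L: R^3 -> R.
def loss(x):
    return jnp.sum(x ** 2)

x = jnp.array([1.0, 2.0, 3.0])

# Backprop as the m = 1 special case: a VJP with the scalar cotangent 1.0...
_, vjp_fn = jax.vjp(loss, x)
(g_vjp,) = vjp_fn(1.0)

# ...agrees with the gradient computed by jax.grad.
g = jax.grad(loss)(x)
assert jnp.allclose(g, g_vjp)   # both are 2x = [2. 4. 6.]
```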
"Is it the adjoint values themselves that are used or the final gradient?"
The adjoint of a variable is the gradient of the loss function with respect to that variable. When we train a neural network, we use the gradients of the loss with respect to the parameters (weights, biases, etc.) to update those parameters. So we do use the adjoints, but only the adjoints of the parameters, and collected together they are exactly the final gradient; there is no difference between the two.
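A sketch of how the parameter adjoints drive the update, assuming plain SGD and a made-up linear model (`params`, `step_size`, and the data are all illustrative):

```python
import jax
import jax.numpy as jnp

# A tiny linear model and a scalar MSE loss over made-up data.
def loss(params, x, y):
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)

params = {"w": jnp.zeros((2,)), "b": jnp.zeros(())}
x = jnp.array([[1.0, 2.0], [3.0, 4.0]])
y = jnp.array([1.0, 2.0])

# The adjoints of the parameters: dL/dw and dL/db, i.e. the gradient.
grads = jax.grad(loss)(params, x, y)

# SGD update: move each parameter against its adjoint.
step_size = 0.1
params = jax.tree_util.tree_map(lambda p, g: p - step_size * g, params, grads)
```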