Get the 3D color histogram within the detected region of interest. That is, not three separate 1D histograms, one per channel, but a single 3D histogram over all 3 channels together. OpenCV's calcHist has options for that. Here's an example which does that. The example uses the Python bindings for OpenCV, but it shows how to set the parameters.
Also, restrict the histogram to a reasonable range for skin color. As MSalters suggested in the comments, HSV is a better color space for things like this. Perhaps you can disregard the S and V channels and only do a 1D histogram for H. Try it out.
The bin with the highest count is going to be your "average" skin color (strictly speaking it is the most common color, the mode, rather than the arithmetic mean).
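Picking that bin could look something like this. It is only a sketch: `dominant_color` is a hypothetical helper, the hand-filled histogram stands in for whatever calcHist returned, and reporting the bin center as the color is my own simplification.

```python
import numpy as np

def dominant_color(hist, value_range=(0, 256)):
    """Center of the fullest bin of a 3D histogram, one value per channel.

    Assumes every channel was binned over the same value_range.
    """
    # argmax works on the flattened array; unravel_index maps it back
    # to a (bin_0, bin_1, bin_2) index.
    peak = np.unravel_index(np.argmax(hist), hist.shape)
    lo, hi = value_range
    widths = [(hi - lo) / n for n in hist.shape]
    # Report the middle of the winning bin for each channel.
    return tuple(lo + (i + 0.5) * w for i, w in zip(peak, widths))

# Stand-in for a real 8x8x8 histogram: one dominant bin plus some noise.
hist = np.zeros((8, 8, 8), dtype=np.float32)
hist[3, 4, 6] = 1600
hist[0, 0, 0] = 10

color = dominant_color(hist)
print(color)
```

With coarse bins the returned color is only as precise as the bin width, so more bins give a finer answer at the cost of a noisier peak.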