It looks like we can open the camera with a suitable GStreamer pipeline, like the one below:
VideoCapture cap("mfw_v4lsrc ! ffmpegcolorspace ! video/x-raw-rgb ! appsink");
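Since the camera output is in YUV, we need to convert it to RGB before passing the frames to OpenCV. The ffmpegcolorspace element does the conversion, and the video/x-raw-rgb caps filter in front of appsink is what makes sure OpenCV gets frames in the RGB colorspace.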
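To make the role of each pipeline element easier to see, here is a minimal sketch that assembles the same pipeline string piece by piece. The helper names buildPipeline and cameraPipeline are made up for illustration and are not part of OpenCV or GStreamer; only the element names come from the pipeline above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Join GStreamer element descriptions with " ! ", the link operator used in
// gst-launch style pipeline strings.
std::string buildPipeline(const std::vector<std::string>& elements) {
    std::string pipeline;
    for (std::size_t i = 0; i < elements.size(); ++i) {
        if (i > 0) pipeline += " ! ";
        pipeline += elements[i];
    }
    return pipeline;
}

// Hypothetical helper: rebuilds the capture pipeline quoted above.
std::string cameraPipeline() {
    return buildPipeline({
        "mfw_v4lsrc",        // camera source, outputs YUV
        "ffmpegcolorspace",  // converts between colorspaces
        "video/x-raw-rgb",   // caps filter: only RGB is allowed past this point
        "appsink"            // hands the frames to the application (OpenCV)
    });
}
```

The resulting string can be passed straight to the capture object, e.g. cv::VideoCapture cap(cameraPipeline()); it is worth checking cap.isOpened() afterwards, since this only works if OpenCV was built with GStreamer support and the elements are installed on the board.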