What kind of objects are you trying to detect? And what kind of negative images are you using? Ideally, the negative images should be large images of scenes typically associated with your objects of interest.
Edit: Even if you are providing 30K negative images, the training may still not have enough negative samples. The trainCascadeObjectDetector function generates negative samples for each stage by running the detector built from the stages trained so far on the negative images. Anything that detector finds there is, by construction, a false positive, and those false positives are used as the negative samples for the next stage. Depending on what kind of negative images you supply, it is quite possible that after some number of stages the current detector no longer finds any false positives in the negative images, which leaves nothing to train the next stage on.
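To get a feel for why the negatives can run out, here is a rough toy model in Python. It is not what trainCascadeObjectDetector actually does internally; it only assumes that each new stage lets a fixed fraction of the remaining false positives through. The function name and all of its parameters (`pool_size`, `false_alarm_rate`, `samples_needed`) are made up for illustration.

```python
def stages_until_exhausted(pool_size, false_alarm_rate, samples_needed, max_stages=50):
    """Toy model: count the stages that can be trained before the
    false positives mined from the negative images run out.

    pool_size        -- hypothetical number of distinct negative windows
    false_alarm_rate -- assumed fraction of negatives each stage still accepts
    samples_needed   -- assumed number of negatives required to train one stage
    """
    survivors = pool_size  # before stage 1, every negative window is a candidate
    trained = 0
    while trained < max_stages and survivors >= samples_needed:
        trained += 1
        # Only windows that still pass the whole cascade can serve as
        # negative samples for the following stage.
        survivors = int(survivors * false_alarm_rate)
    return trained


if __name__ == "__main__":
    print(stages_until_exhausted(30_000, 0.5, 1_000))   # 5
    print(stages_until_exhausted(300_000, 0.5, 1_000))  # 9
```

In this model the pool shrinks geometrically, so ten times as many distinct windows buys only a few extra stages. Near-duplicate frames add very little to the number of distinct windows, so they help less than the raw image count suggests.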
You have said that your negative images come from a video of your room. The problem may be that all your negative images are too similar to each other, so you should probably add other kinds of images to your negative set to diversify it. Also, make sure your negative images include hand gestures other than the one you are training for.