Question

I am trying to implement a people-counting system using computer vision for a university project. Currently, my method is:

  1. Background subtraction using MOG2
  2. Morphological filter to remove noise
  3. Track the blobs
  4. Count blobs passing a specified region (a line)
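To make step 4 concrete, here is a minimal sketch of the counting logic, assuming the tracker from step 3 already yields one centroid per frame for each blob. The `tracks` dictionary, the function names, and the horizontal line at `line_y` are illustrative assumptions, not part of any library.

```python
def side_of_line(point, line_y):
    """Return -1 if the point is above the horizontal line, +1 if on or below it."""
    return -1 if point[1] < line_y else 1


def count_crossings(tracks, line_y):
    """Count tracks whose centroid moves from one side of the line to the other.

    tracks: dict mapping a track id to its list of (x, y) centroids, one per frame.
    Returns (down, up) counts; each track is counted at most once.
    """
    down = up = 0
    for centroids in tracks.values():
        for prev, curr in zip(centroids, centroids[1:]):
            before = side_of_line(prev, line_y)
            after = side_of_line(curr, line_y)
            if before != after:
                if after > before:
                    down += 1  # image y grows downward
                else:
                    up += 1
                break  # count each track only once
    return down, up


# Hypothetical centroid histories for three tracked blobs.
tracks = {
    1: [(50, 80), (52, 95), (53, 110)],      # crosses y=100 moving down
    2: [(120, 130), (118, 104), (117, 90)],  # crosses y=100 moving up
    3: [(200, 40), (205, 60)],               # never reaches the line
}
print(count_crossings(tracks, line_y=100))
```

Because this counts tracks rather than people, a group merged into one blob still produces a single crossing, which is exactly the failure described next.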

The problem is that if people arrive as a group, my method counts them as only one person. From my reading, I believe this is what is called occlusion. Another problem is that when a person looks similar to the background (wearing dark clothing and passing a black pillar or wall), the blob is split into several pieces even though it is actually one person.

From what I have read, I should implement a detector plus a tracker (e.g. detect humans using HOG). But my detection results are poor (e.g. 50% false positives with a 50% hit rate, using both the OpenCV human detector and my own trained detector), so I am not convinced that I should use the detector as the basis for tracking. Thanks for your answers and for taking the time to read this post!


Solution 2

There is no single "good" answer to this, as handling occlusion (and background subtraction) is still an open problem. Several pointers might help you with your project.

You want to detect if a "blob" is one person or a group of people. There are several things you could do to handle this.

  • Use multiple cameras (it's unlikely that a group of people is detected as a single blob from all angles)
  • Try to detect parts of the human body. If you detect two heads on a single blob, there are multiple people. The same can be said for 3 legs, 5 shoulders, etc.
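
As a rough illustration of the body-part idea, the sketch below estimates how many people a blob contains by counting head detections whose center falls inside the blob's bounding box. It assumes you already have some head detector producing `(x, y, w, h)` boxes; that detector, the box format, and the function names are assumptions made for the example.

```python
def center(box):
    """Center point of an (x, y, w, h) box."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def contains(box, point):
    """True if the point lies inside the (x, y, w, h) box."""
    x, y, w, h = box
    px, py = point
    return x <= px <= x + w and y <= py <= y + h


def people_per_blob(blob_boxes, head_boxes):
    """Estimate people per blob as the number of head centers inside it.

    A blob with no detected head still counts as one person, since the
    head detector may simply have missed it.
    """
    counts = []
    for blob in blob_boxes:
        heads = sum(1 for head in head_boxes if contains(blob, center(head)))
        counts.append(max(1, heads))
    return counts


# Hypothetical boxes: two blobs, and two heads that both fall in the first one.
blobs = [(10, 10, 100, 200), (300, 20, 60, 180)]
heads = [(20, 15, 30, 30), (70, 18, 30, 30)]
print(people_per_blob(blobs, heads))
```

The per-blob count can then replace the "one blob, one person" assumption when a blob crosses the counting line.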

For tracking a "lost" person (one walking behind another object), you can extrapolate their position. You know that a person can only move so far between frames. Taking this into account, you know that it's impossible for a person to be detected in the middle of your image and then suddenly disappear. After several frames of not seeing that person, you can discard the observation, as the person might have had enough time to move away.
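
Here is a minimal sketch of that extrapolation, assuming a constant-velocity model and one matched detection (or `None`) per frame. The `Track` class and the `max_missed` and `max_jump` thresholds are hypothetical names and values you would tune for your own footage.

```python
class Track:
    """One tracked person with a constant-velocity motion model."""

    def __init__(self, position, max_missed=10):
        self.position = position
        self.velocity = (0.0, 0.0)
        self.missed = 0              # consecutive frames without a detection
        self.max_missed = max_missed

    def predict(self):
        """Extrapolated position for the next frame."""
        return (self.position[0] + self.velocity[0],
                self.position[1] + self.velocity[1])

    def update(self, detection, max_jump=50.0):
        """Feed this frame's matched detection, or None if nothing matched."""
        predicted = self.predict()
        if detection is not None:
            dx = detection[0] - predicted[0]
            dy = detection[1] - predicted[1]
            # A person can only move so far between frames: reject big jumps.
            if (dx * dx + dy * dy) ** 0.5 > max_jump:
                detection = None
        if detection is None:
            # Occluded or missed: coast along the extrapolated path.
            self.position = predicted
            self.missed += 1
        else:
            self.velocity = (detection[0] - self.position[0],
                             detection[1] - self.position[1])
            self.position = detection
            self.missed = 0

    @property
    def lost(self):
        """True once the person has been unseen for too many frames."""
        return self.missed > self.max_missed


track = Track((100.0, 100.0), max_missed=3)
track.update((105.0, 100.0))  # seen: velocity becomes (5, 0)
track.update(None)            # hidden behind a pillar: coast
track.update(None)
print(track.position, track.lost)
```

Only tracks for which `lost` becomes true are dropped, so a short occlusion does not create a second track (and a second count) for the same person.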

Other tips

Tracking people in video surveillance sequences is still an open problem in the research community. However, particle filters (PF), also known as sequential Monte Carlo methods, give good results in the presence of occlusion and complex scenes. You should read this. There are also extra links to example source code after the bibliography.

An advantage of using a PF is the gain in computation time compared with tracking by detection alone.
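
To give a feel for the idea, here is a minimal sketch of a bootstrap particle filter tracking one 2D position. It is not taken from the linked material: the random-walk motion model, the Gaussian measurement likelihood, and the `motion_std` and `meas_std` values are all simplifying assumptions. During occlusion (`measurement=None`) the particles just keep spreading until the target is seen again.

```python
import math
import random


def particle_filter_step(particles, measurement, rng,
                         motion_std=2.0, meas_std=5.0):
    """One predict/weight/resample cycle of a bootstrap particle filter.

    particles: list of (x, y) position hypotheses.
    measurement: observed (x, y), or None while the target is occluded.
    """
    # Predict: diffuse each particle with random-walk motion noise.
    moved = [(x + rng.gauss(0, motion_std), y + rng.gauss(0, motion_std))
             for x, y in particles]
    if measurement is None:
        return moved  # no observation: keep the spread-out prediction

    # Weight: Gaussian likelihood of the measurement given each particle.
    mx, my = measurement
    weights = [math.exp(-((x - mx) ** 2 + (y - my) ** 2) / (2 * meas_std ** 2))
               for x, y in moved]
    if sum(weights) == 0:
        return moved  # measurement implausible for every particle: ignore it

    # Resample: draw particles in proportion to their weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def estimate(particles):
    """Mean of the particle cloud, used as the position estimate."""
    n = len(particles)
    return (sum(x for x, _ in particles) / n,
            sum(y for _, y in particles) / n)


rng = random.Random(0)
particles = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(500)]
for t in range(20):
    truth = (20 + 2 * t, 50)                       # target walks to the right
    measurement = None if 8 <= t <= 11 else truth  # occluded for four frames
    particles = particle_filter_step(particles, measurement, rng)
print(estimate(particles), "vs. true position", truth)
```

In a real tracker the measurement would come from your blobs (or from an appearance model such as a color histogram), and the estimate lags a moving target slightly here because the motion model has no velocity term.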

If you go this way, feel free to ask if you want a better understanding of the maths behind the PF.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow