Fast image segmentation algorithms for automatic object extraction

https://stackoverflow.com/questions/23655259

22-07-2023
|

Вопрос

I need to segment a set of unknown objects (books, cans, toys, boxes, etc.) standing on top of a surface (table top, floor…). I want to extract a mask (either binary or probabilistic) for each object on the scene.

I do know what the appearance of the surface is (a color model). The size, geometry, amount, appearance of the objects is arbitrary, and they could be texture-less as well). Multiple views might be available as well. No user interaction is available.

I have been struggling on picking the best kind of algorithm for this scenario (graph based, cluster based, super-pixels, etc.). This comes, naturally from a lack of experience with different methods. I'd like to know how they compare one to another.

I have some constraints:

Can’t use libraries (it’s a legal constraint, except for OpenCV). So any algorithm must be implemented by me. So I’d like to choose an algorithm that is simple enough to be implemented in a non-too-long period of time.
Performance is VERY important. There will be many other processes running at the same time, so I can’t afford to have a slow method.

It’s much preferred to have a fast and simple method with less resolution than something complex and slow that provides better results.

Any suggestion on some approach suitable for this scenario would be appreciated.

Решение

For speed, I'd quickly segment the image into surface and non-surface (stuff). So this at least gets you from 24 bits (color?) to 8 bits or even one bit, if you are brave.

From there you need to aggregate and filter the regions into blobs of appropriate size. For that, I'd try a morphological (or linear) filter that is keyed to a disk that would just fit inside the smallest object of interest. This would be an opening and a closing. Perhaps starting with smaller radii for better results.

From there, you should have an image of blobs that can be found and discriminated. Each blob or region should designate the objects of interest.

Note that if you can get to a 1-bit image, things can go VERY fast. However, efficient tools that can make use of this data form (8 pixels per character) are often not forthcoming.

Лицензировано под: CC-BY-SA с атрибуция

Не связан с StackOverflow