Question

I've created an OpenCV application for human detection in images.

I run my detector on the same image at several scales, and when a detection is made, I end up with the bounding box position and the scale at which it was found. I then want to transform that rectangle back to the original scale, since both its position and its size vary with the scale.

I've tried wrapping my head around this and gotten nowhere. It should be fairly simple, but at the moment I'm clueless.

Can anyone help?


Solution

OK, I got the answer elsewhere:

"What you should do is store the scale where you are at for each detection. Then transforming should be rather easy right. Imagine you have the following.

X and Y coordinates (the center of the bounding box) at a scale of 1/2 of the original. This means you should multiply by the inverse of the scale to get the location in the original image, which would be 2X, 2Y (again, for the bounding box center).

So first transform the center of the bounding box, then calculate the width and height of your bounding box in the original, again by multiplying by the inverse. Then, from the center, your box will extend ±width_double/2 and ±height_double/2 (where width_double and height_double are the rescaled width and height)."
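For concreteness, here is a minimal Python sketch of that transform. The function name and the (center x, center y, width, height) box format are assumptions for illustration; adapt them to however your detector reports its boxes.

```python
def to_original_scale(box, scale):
    """Map a detection made at `scale` back to the original image.

    box   -- (cx, cy, w, h): bounding box center, width, and height
             in the rescaled image's coordinates
    scale -- factor the image was resized by before detection,
             e.g. 0.5 if the detector ran on a half-size image
    """
    cx, cy, w, h = box
    inv = 1.0 / scale            # multiply by the inverse of the scale
    cx, cy = cx * inv, cy * inv  # transform the box center first
    w, h = w * inv, h * inv      # then the width and height
    # Corners follow from the center: +- half width, +- half height.
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

# A box centered at (100, 80) with size 40x60, found at half scale,
# maps to corners (160, 100) and (240, 220) in the original image.
print(to_original_scale((100, 80, 40, 60), 0.5))
```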

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow