Question

I'm trying to understand the role of data augmentation and how it can affect the performance/accuracy of a deep model. My target application is fire classification (fire or no fire, on video frames), with almost 15K positive and negative samples, and I was using the data augmentation techniques below. Does using all of the following always increase performance, or do we have to choose them smartly given the target application?

rotation_range=20, width_shift_range=0.2, height_shift_range=0.2,zoom_range=0.2, horizontal_flip=True

When I think about it a bit more, fire is always upright, so rotation or shift might in fact worsen the results: shifting stretches the image edges in a way that never occurs in real fire footage, and rotation has the same problem. So I think I should keep only zoom_range=0.2 and horizontal_flip=True and remove the first three, also because I see some false positives when there is a scene-transition effect in the videos.

Is my argument correct? Should I keep them or remove them?
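For reference, a minimal sketch of the trimmed configuration I have in mind (the keyword names follow Keras's ImageDataGenerator as used above; building the actual generator requires TensorFlow, so that part is left commented):

```python
# Trimmed augmentation settings: keep only transforms that produce
# physically plausible fire frames (flames stay upright).
aug_kwargs = dict(
    zoom_range=0.2,        # mild zoom is plausible (camera distance varies)
    horizontal_flip=True,  # a mirrored flame is still a valid flame
    # rotation_range, width_shift_range and height_shift_range are dropped:
    # they tilt or stretch the frame in ways real fire footage never shows.
)

# Requires TensorFlow; shown only to illustrate how the settings
# above would be applied:
# from tensorflow.keras.preprocessing.image import ImageDataGenerator
# datagen = ImageDataGenerator(**aug_kwargs)
```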


Solution

Your reasoning is correct. Augmentation is just a process that helps you cover your input domain better, so you should pick only the operators that actually help; abusing augmentation can definitely mess up your model. It's always a good idea to print augmented samples at the limits of each parameter range to check yourself, and to think about how data will be acquired in production. Albumentations is a nice library with a lot of available augmentation methods.
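In the spirit of checking yourself at the limits, here is a tiny NumPy sketch (the 4x4 "flame" array is a made-up illustration, not real frame data) of why a horizontal flip preserves the upright-flame prior while a vertical flip, standing in for a large rotation, breaks it:

```python
import numpy as np

# Toy 4x4 grayscale "frame": a flame that is wider at the bottom.
frame = np.array([
    [0, 9, 0, 0],
    [0, 9, 0, 0],
    [9, 9, 9, 0],
    [9, 9, 9, 0],
], dtype=float)

def bottom_heavy(img):
    """True if more 'fire' mass sits in the bottom row than the top row."""
    return img[-1].sum() > img[0].sum()

h_flip = frame[:, ::-1]  # horizontal flip: mirrored, but still upright
v_flip = frame[::-1, :]  # vertical flip: the flame base ends up on top

print(bottom_heavy(frame))   # True
print(bottom_heavy(h_flip))  # True  -> still a plausible fire frame
print(bottom_heavy(v_flip))  # False -> implausible input for training
```

The same kind of quick check, run on a few real frames with each augmentation pushed to its configured limit, makes it obvious which operators generate realistic samples and which do not.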

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange