Question

I'm reading Deep Learning with Python by François Chollet. In section 5.3.1, we've instantiated a pretrained convnet, VGG16, and are given two options for how to proceed (I've sketched both in code below the list):

A) Running the convolutional base over your dataset, recording its output to a Numpy array on disk, and then using this data as input to a standalone, densely connected classifier similar to those you saw in part 1 of this book. This solution is fast and cheap to run, because it only requires running the convolutional base once for every input image, and the convolutional base is by far the most expensive part of the pipeline. But for the same reason, this technique won’t allow you to use data augmentation.

B) Extending the model you have (conv_base) by adding Dense layers on top, and running the whole thing end to end on the input data. This will allow you to use data augmentation, because every input image goes through the convolutional base every time it’s seen by the model. But for the same reason, this technique is far more expensive than the first.
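For reference, here is my understanding of option A, based on the book's feature-extraction listing. The directory path and sample count are placeholders for my own setup:

```python
import numpy as np
from keras.applications import VGG16
from keras.preprocessing.image import ImageDataGenerator
from keras import models, layers

conv_base = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(150, 150, 3))

datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20

def extract_features(directory, sample_count):
    # The conv base maps each 150x150 image to a 4x4x512 feature map.
    features = np.zeros((sample_count, 4, 4, 512))
    labels = np.zeros((sample_count,))
    generator = datagen.flow_from_directory(
        directory,
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode='binary')
    i = 0
    for inputs_batch, labels_batch in generator:
        features[i * batch_size : (i + 1) * batch_size] = conv_base.predict(inputs_batch)
        labels[i * batch_size : (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            break  # the generator loops forever, so stop by hand
    return features, labels

train_features, train_labels = extract_features('path/to/train_dir', 2000)

# Standalone densely connected classifier trained on the cached features
model = models.Sequential()
model.add(layers.Dense(256, activation='relu', input_dim=4 * 4 * 512))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
model.fit(train_features.reshape(2000, 4 * 4 * 512), train_labels,
          epochs=30, batch_size=20)
```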
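And option B, again roughly following the book's listing and reusing conv_base from the sketch above (same placeholder path; the augmentation parameters are just the ones the book uses as examples):

```python
from keras import models, layers
from keras.preprocessing.image import ImageDataGenerator

model = models.Sequential()
model.add(conv_base)                 # the pretrained VGG16 base from above
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

conv_base.trainable = False          # freeze VGG16; only the Dense layers train

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
train_generator = train_datagen.flow_from_directory(
    'path/to/train_dir',
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')

model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
model.fit_generator(train_generator, steps_per_epoch=100, epochs=30)
```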

Why can't I use data augmentation to generate more training data from the existing training samples and then go with option A? It seems like I could run the VGG16 base over my augmented dataset and use its output as the input to a standalone, densely connected classifier, roughly as sketched below.
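Concretely, I'm imagining something like this, continuing from the option A sketch above (same conv_base, batch_size, and placeholder path); the rounds parameter is my own made-up knob, not something from the book:

```python
augmenting_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)

def extract_augmented_features(directory, sample_count, rounds=5):
    # Each pass over the data draws fresh random transformations, so this
    # caches a feature set `rounds` times larger than the original dataset.
    total = sample_count * rounds
    features = np.zeros((total, 4, 4, 512))
    labels = np.zeros((total,))
    generator = augmenting_datagen.flow_from_directory(
        directory,
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode='binary')
    i = 0
    for inputs_batch, labels_batch in generator:
        n = min(len(inputs_batch), total - i)
        features[i : i + n] = conv_base.predict(inputs_batch)[:n]
        labels[i : i + n] = labels_batch[:n]
        i += n
        if i >= total:
            break  # stop once `rounds` augmented passes are cached
    return features, labels

aug_features, aug_labels = extract_augmented_features('path/to/train_dir', 2000)
```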

