Load data from multiple dataframes containing both path and labels in keras
12-12-2020
Question
I am aware that keras.preprocessing.image.ImageDataGenerator has a method called flow_from_dataframe. But this method assumes that we have one dataframe containing all the image paths and the labels associated with them. What if we have multiple dataframes and want to use them on a per-epoch basis, so that each epoch loads a different dataframe together with its images and labels?
Thanks in advance.
Solution
I believe training will continue where it left off as long as you don't recompile the model, because the dataset is only passed in at the last step, the call to fit.
So something similar to this snippet should work:
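from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# df_1 and df_2 are your separate dataframes. "filename" and "class" are the Keras default column names; change them to match your data.
train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2, horizontal_flip=True)
# target_size is (height, width) only; color_mode='grayscale' already gives a single channel
training_set_1 = train_datagen.flow_from_dataframe(dataframe=df_1, x_col='filename', y_col='class', class_mode='binary', color_mode='grayscale', target_size=(28, 28), batch_size=2)
training_set_2 = train_datagen.flow_from_dataframe(dataframe=df_2, x_col='filename', y_col='class', class_mode='binary', color_mode='grayscale', target_size=(28, 28), batch_size=2)
model = keras.models.Sequential([ ... ])
# older Keras versions spell this lr=0.2; the original also passed decay=0.01, which recent versions no longer accept
optimizer = keras.optimizers.SGD(learning_rate=0.2, momentum=0.9)
model.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["accuracy"])
for i in range(10):
    # each fit() call continues from the current weights; the model is not recompiled
    history = model.fit(training_set_1, epochs=1)
    history = model.fit(training_set_2, epochs=1)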
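The same pattern should extend to any number of dataframes. Here is a minimal sketch, assuming one generator has already been built per dataframe. fit_round_robin is a hypothetical helper, not part of Keras, and it relies only on model.fit(data, epochs=1):

```python
def fit_round_robin(model, generators, epochs):
    """Train for `epochs` epochs, switching to the next generator each epoch.

    Epoch k uses generators[k % len(generators)], so the dataframes are
    visited in order and the cycle restarts when the list is exhausted.
    Returns the list of generator indices that were used, in order.
    """
    if not generators:
        raise ValueError("need at least one generator")
    used = []
    for epoch in range(epochs):
        index = epoch % len(generators)
        # One epoch on the selected generator; weights carry over between calls.
        model.fit(generators[index], epochs=1)
        used.append(index)
    return used
```

With the two generators from the snippet, fit_round_robin(model, [training_set_1, training_set_2], epochs=20) would alternate between them in the same way as the explicit loop.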
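Note that learning can be very noisy if the data in the dataframes differs a lot, for example in variance, from one generator to the next.
Also, each call to fit returns a new History object, so the history variable is overwritten every time and only holds the most recent epoch.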
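If you want a single training curve across all the fit calls, one option is to collect the per-call records yourself. This is a minimal sketch, assuming each record is the dict stored in the history attribute of the object returned by fit (metric name mapped to a list of per-epoch values). merge_histories is a hypothetical helper name, not a Keras function:

```python
from collections import defaultdict


def merge_histories(history_dicts):
    """Concatenate per-call metric lists into one dict, in call order."""
    merged = defaultdict(list)
    for record in history_dicts:
        for name, values in record.items():
            merged[name].extend(values)
    return dict(merged)


# Example with hand-written records standing in for model.fit(...).history
records = [
    {"loss": [0.9], "accuracy": [0.5]},
    {"loss": [0.7], "accuracy": [0.6]},
]
full_history = merge_histories(records)
```

Inside the training loop you would append model.fit(...).history to a list after every call and merge the list once training is finished.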