Load data from multiple dataframes containing both path and labels in keras

https://datascience.stackexchange.com/questions/76398

12-12-2020

Question

I am aware that keras.preprocessing.image.ImageDataGenerator has a method called flow_from_dataframe. But this method assumes that a single dataframe contains all the image paths and their associated labels. What if we have multiple dataframes and want to use them on a per-epoch basis, so that each epoch loads a different dataframe along with its images and labels? Thanks in advance.

Solution

I believe training will continue from the current weights as long as you don't recompile the model, since the dataset is only passed in at the last step (the call to fit).
So something similar to this snippet should work:

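from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2, horizontal_flip=True)

# df_1 and df_2 stand for the separate dataframes; the default "filename" and "class" columns are assumed.
# target_size is (height, width) only; color_mode='grayscale' supplies the single channel.
# class_mode='binary' is an assumption (two classes) that matches the binary_crossentropy loss below.
training_set_1 = train_datagen.flow_from_dataframe(dataframe=df_1, color_mode='grayscale', target_size=(28, 28), class_mode='binary', batch_size=2)
training_set_2 = train_datagen.flow_from_dataframe(dataframe=df_2, color_mode='grayscale', target_size=(28, 28), class_mode='binary', batch_size=2)

model = keras.models.Sequential([ ... ])

# Newer Keras spells the argument learning_rate instead of lr; recent releases may also reject decay.
optimizer = keras.optimizers.SGD(learning_rate=0.2, momentum=0.9, decay=0.01)
model.compile(loss="binary_crossentropy", optimizer=optimizer,
              metrics=["accuracy"])

# Each fit call continues from the current weights because the model is not recompiled.
for i in range(10):
    history = model.fit(training_set_1, epochs=1)
    history = model.fit(training_set_2, epochs=1)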
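
If there are more than two dataframes, the same idea can be written as a loop that picks a generator by epoch index. This is only a sketch: fit_cycling is a hypothetical helper name, and it assumes model is a compiled Keras model and generators is a list of iterators such as those returned by flow_from_dataframe. Passing initial_epoch and epochs together keeps the epoch counter in the Keras logs running instead of restarting at 1 on every call.

```python
def fit_cycling(model, generators, total_epochs):
    """Train for total_epochs, switching to the next generator each epoch.

    Hypothetical helper: `model` only needs a Keras-style fit() method.
    """
    for epoch in range(total_epochs):
        # Cycle through the generators: 0, 1, ..., n-1, 0, 1, ...
        generator = generators[epoch % len(generators)]
        # With initial_epoch set, `epochs` is the index of the final epoch,
        # so this call trains for exactly one epoch.
        model.fit(generator, initial_epoch=epoch, epochs=epoch + 1)
```

With two generators, fit_cycling(model, [training_set_1, training_set_2], total_epochs=20) should visit the data in the same order as the 10-iteration loop above.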

Be aware that learning can be very noisy if the data coming from the different generators varies a lot.
Also, history is overwritten on every call to fit, so it only ever holds the metrics of the most recent call.
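
One way to keep the full training record is to save the .history dict of each returned History object and concatenate them afterwards. The following is a minimal sketch; merge_histories is a hypothetical helper, and it relies only on the fact that History.history maps each metric name to a list of per-epoch values.

```python
from collections import defaultdict


def merge_histories(history_dicts):
    """Concatenate per-call metric lists into a single dict of lists."""
    merged = defaultdict(list)
    for history_dict in history_dicts:
        for metric, values in history_dict.items():
            merged[metric].extend(values)
    return dict(merged)


# In the training loop one would collect the dict after every call, e.g.:
#     all_histories.append(model.fit(training_set_1, epochs=1).history)
# and call merge_histories(all_histories) once training is done.
```

The merged lists are in call order, so consecutive entries may come from different dataframes.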

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange