In Mini Batch Gradient Descent what happens to remaining examples
13-12-2020
Question
Suppose my dataset has 1000 samples (X = 1000) and I choose a batch size of 32.
Since 1000 is not perfectly divisible by 32, the remainder is 8.
My question is: what happens to the last 8 examples? Are they considered? If they are, will they affect the efficiency of my model?
import numpy as np

def next_batch(X, y, batchSize):
    # walk through the dataset in strides of batchSize
    for i in np.arange(0, X.shape[0], batchSize):
        # the final slice may be shorter than batchSize
        yield (X[i:i + batchSize], y[i:i + batchSize])
This code is from a book, and as far as I can tell it does not consider the last remaining data points.
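A quick way to check is to run the generator on a dummy 1000-row array. NumPy slicing past the end of an array is safe, so the last slice simply returns whatever rows remain; the leftover 8 samples do form a final, smaller batch:

```python
import numpy as np

def next_batch(X, y, batchSize):
    # same generator as above: stride through the data in batchSize steps
    for i in np.arange(0, X.shape[0], batchSize):
        # slicing past the end of a NumPy array just truncates,
        # so the last yield returns the remaining rows
        yield (X[i:i + batchSize], y[i:i + batchSize])

# dummy data matching the question: 1000 samples, batch size 32
X = np.zeros((1000, 4))
y = np.zeros(1000)

sizes = [len(bx) for bx, _ in next_batch(X, y, 32)]
print(len(sizes), sizes[-1])  # 32 batches in total, the last one holds 8 samples
```

So the book's generator does not drop the remainder: it yields 31 full batches of 32 plus one batch of 8.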
Solution
This is implementation-dependent, but there is no reason the last few records should be dropped.
In Keras, the remaining data points are taken as the final (smaller) step.
Adding one extra element increases the number of steps by 1.
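The step count Keras reports in the two cases below matches ceil(n / batch_size). A minimal sketch of that count (steps_per_epoch here is an illustrative helper written for this answer, not a Keras API):

```python
import math

def steps_per_epoch(n_samples, batch_size):
    # Keras-style behaviour: any leftover samples form one extra, smaller batch
    return math.ceil(n_samples / batch_size)

print(steps_per_epoch(864, 16))  # 54  (864 divides evenly by 16)
print(steps_per_epoch(865, 16))  # 55  (one leftover sample adds a step)
```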
Case-I - Data count is divisible by batch_size
epochs = 1
batch_size = 16
history = model.fit(x_train.iloc[:864], y_train[:864], batch_size=batch_size, epochs=epochs)
54/54 [==============================] - 0s 3ms/step
Case-II - Adding an extra data point
epochs = 1
batch_size = 16
history = model.fit(x_train.iloc[:865], y_train[:865], batch_size=batch_size, epochs=epochs)
55/55 [==============================] - 0s 3ms/step
The same thing happens in your example:
batch_size = 16
np.arange(0, x_train.shape[0], batch_size)
..., 672, 688, 704, 720, 736, 752, 768, 784, 800, 816, 832, 848, 864])
When the last slice happens, it is a batch of only 11 data points:
len(x_train[864:880])  # x_train ends at row 875, so the slice stops there
11
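Putting it together, a quick sketch with a dummy 875-row array (standing in for x_train) confirms that the strided slicing yields 55 batches, the last holding the remaining 11 points:

```python
import numpy as np

x = np.arange(875)  # stand-in for the 875 training rows
batch_size = 16

# the same start indices np.arange produced above: 0, 16, ..., 864
starts = np.arange(0, x.shape[0], batch_size)
batches = [x[i:i + batch_size] for i in starts]

print(len(batches), len(batches[-1]))  # 55 batches; the last one holds 11 points
```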