Question

My data set has 200 columns, where each column holds the value of the same pixel position across all of my images, and 48,500 rows. The labels for the data range from 0 to 9.

The data looks something like this:

      raw_0    raw_1    raw_2    raw_3    raw_4
 0    120.0    133.0     96.0    155.0     66.0
 1    159.0    167.0    163.0    185.0    160.0
 2     45.0    239.0     66.0    252.0      NaN
 3    126.0    239.0    137.0      NaN    120.0
 4    226.0    222.0    153.0    235.0    171.0
 5    169.0     81.0    100.0     44.0    104.0
 6    154.0    145.0     76.0    134.0    175.0
 7     77.0     35.0    105.0    108.0    112.0
 8    104.0     55.0    113.0     90.0    107.0
 9     97.0    253.0    255.0    251.0    141.0
10    224.0    227.0     84.0    214.0     57.0
11      NaN     13.0     51.0     50.0      NaN
12     82.0    213.0     61.0     98.0     59.0
13      NaN     40.0     84.0      7.0     39.0
14    129.0    103.0     65.0    159.0      NaN
15    123.0    128.0    116.0    198.0    111.0
15  123.0   128.0   116.0   198.0   111.0

Each column has around 5% missing values and I want to fill in these NaN values with something meaningful. However, I'm not sure how to go about this. Any suggestions would be welcome.

Thank you!
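
For reference, here is a minimal sketch of two common baseline fills for a frame shaped like the one above. It is not a recommended solution. It assumes the data lives in a pandas DataFrame (the name `df` is hypothetical) and uses only the first five rows of the sample. The row-wise variant further assumes that all columns in a row are pixels of the same image. The question does not give the spatial layout of the pixels, so neighbour-based interpolation is not shown.

```python
import numpy as np
import pandas as pd

# Toy frame: the first five rows of the sample above, not the full 48,500 x 200 data.
df = pd.DataFrame(
    {
        "raw_0": [120.0, 159.0, 45.0, 126.0, 226.0],
        "raw_1": [133.0, 167.0, 239.0, 239.0, 222.0],
        "raw_2": [96.0, 163.0, 66.0, 137.0, 153.0],
        "raw_3": [155.0, 185.0, 252.0, np.nan, 235.0],
        "raw_4": [66.0, 160.0, np.nan, 120.0, 171.0],
    }
)

# Fraction of missing values per column (about 0.05 on the real data, per the question).
missing_share = df.isna().mean()

# Baseline 1: fill each column (pixel position) with that column's median.
col_filled = df.fillna(df.median())

# Baseline 2: fill each missing pixel with the mean of the observed pixels
# in the same row (assumed to be the same image).
row_means = df.mean(axis=1)
row_filled = df.apply(lambda col: col.fillna(row_means))
```

If the filled data feeds a classifier for the 0-9 labels, the fill statistics would normally be computed on the training split only and then reused on the test split.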

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange