Question

I have the following dataframe in Python (the actual dataframe is much bigger, just presenting a small sample):

      A     B     C     D     E     F
0  0.43  0.52  0.96  1.17  1.17  2.85
1  0.43  0.52  1.17  2.72  2.75  2.94
2  0.43  0.53  1.48  2.85  2.83  
3  0.47  0.59  1.58        3.14  
4  0.49  0.80        

I convert the dataframe to numpy using df.values and then pass that to boxplot.

When I try to make a boxplot out of this pandas dataframe, the number of values picked from each column is restricted to the least number of values in a column (in this case, column F). Is there any way I can boxplot all values from each column?

NOTE: I use df.dropna() to drop the rows with missing values. However, this shrinks the dataframe to the rows that are complete in every column (here, the length of the shortest column), which messes up the plot.

import prettyplotlib as ppl
import numpy as np
import pandas
import matplotlib as mpl
from matplotlib import pyplot

df = pandas.read_csv(csv_data, index_col=False)  # DataFrame.from_csv was removed from pandas
df = df.dropna()
labels = ['A', 'B', 'C', 'D', 'E', 'F']
fig, ax = pyplot.subplots()
ppl.boxplot(ax, df.values, xticklabels=labels)
pyplot.show()

Solution

The right way to do it, without reinventing the wheel, is to use the .boxplot() method in pandas, which handles NaN values correctly:

In [31]:

print(df)
      A     B     C     D     E     F
0  0.43  0.52  0.96  1.17  1.17  2.85
1  0.43  0.52  1.17  2.72  2.75  2.94
2  0.43  0.53  1.48  2.85  2.83   NaN
3  0.47  0.59  1.58   NaN  3.14   NaN
4  0.49  0.80   NaN   NaN   NaN   NaN

[5 rows x 6 columns]
In [32]:

import matplotlib.pyplot as plt
_ = plt.boxplot(df.values)
_ = plt.xticks(range(1, 7), labels)
plt.savefig('1.png')  # keep the NaNs and plot with matplotlib

In [33]:

_ = df.boxplot()
plt.savefig('2.png')  # keep the NaNs and plot with pandas
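
If you have to stay with matplotlib (or a wrapper such as prettyplotlib), one option is to do by hand what df.boxplot() does: drop the NaN values column by column instead of row by row, and pass a list of arrays of unequal length. This is only a minimal sketch. The dataframe is rebuilt by hand from the sample in the question, and the Agg backend is an assumption so that the script also runs without a display:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

nan = np.nan
# Sample data from the question, with the blanks written as NaN.
df = pd.DataFrame({
    "A": [0.43, 0.43, 0.43, 0.47, 0.49],
    "B": [0.52, 0.52, 0.53, 0.59, 0.80],
    "C": [0.96, 1.17, 1.48, 1.58, nan],
    "D": [1.17, 2.72, 2.85, nan, nan],
    "E": [1.17, 2.75, 2.83, 3.14, nan],
    "F": [2.85, 2.94, nan, nan, nan],
})

# Drop NaNs per column, so every column keeps all of its own values.
columns = [df[name].dropna().values for name in df.columns]

fig, ax = plt.subplots()
result = ax.boxplot(columns)  # one box per array; lengths may differ
ax.set_xticks(range(1, len(df.columns) + 1))
ax.set_xticklabels(list(df.columns))
```

Call fig.savefig(...) or plt.show() afterwards to see the figure. Because each box is computed from its own array, column A still uses five values while column F uses only two.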

In [34]:

_ = plt.boxplot(df.dropna().values)
_ = plt.xticks(range(1, 7), labels)
plt.savefig('3.png')  # drop the NaNs and plot with matplotlib
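
To see how much data the row-wise df.dropna() in this last plot throws away, it can help to compare the number of non-NaN values per column with the number of complete rows. A small sketch, again assuming the sample dataframe from the question rebuilt by hand (the names per_column and complete_rows are illustrative):

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample data from the question, with the blanks written as NaN.
df = pd.DataFrame({
    "A": [0.43, 0.43, 0.43, 0.47, 0.49],
    "B": [0.52, 0.52, 0.53, 0.59, 0.80],
    "C": [0.96, 1.17, 1.48, 1.58, nan],
    "D": [1.17, 2.72, 2.85, nan, nan],
    "E": [1.17, 2.75, 2.83, 3.14, nan],
    "F": [2.85, 2.94, nan, nan, nan],
})

# Non-NaN values available in each column.
per_column = {name: int(n) for name, n in df.count().items()}

# Rows that survive df.dropna(), i.e. rows with no NaN in any column.
complete_rows = len(df.dropna())

print(per_column)     # {'A': 5, 'B': 5, 'C': 4, 'D': 3, 'E': 4, 'F': 2}
print(complete_rows)  # 2
```

Only the first two rows are complete, so after df.dropna() every box is drawn from two values, no matter how many values its column originally had.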

Licensed under: CC-BY-SA with attribution