How to make a boxplot where each row in my dataframe object is a box in the plot?
I have some stock data that I want to plot with a box plot. My data is from Yahoo Finance and includes Open, High, Low, Close, Adjusted Close and Volume data for each trading day. I want to plot a box plot where each box is one day of OHLC price action.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.io.data import DataReader
# get daily stock price data from yahoo finance for S&P500
SP = DataReader("^GSPC", "yahoo")
SP.head()
               Open     High      Low    Close      Volume  Adj Close
Date
2010-01-04  1116.56  1133.87  1116.56  1132.99  3991400000    1132.99
2010-01-05  1132.66  1136.63  1129.66  1136.52  2491020000    1136.52
2010-01-06  1135.71  1139.19  1133.95  1137.14  4972660000    1137.14
2010-01-07  1136.27  1142.46  1131.32  1141.69  5270680000    1141.69
2010-01-08  1140.52  1145.39  1136.22  1144.98  4389590000    1144.98
plt.figure()
bp = SP.boxplot()
But when I plot this DataFrame as a boxplot, I effectively get only one visible box, which summarizes the entire Volume column rather than any single day's Open, High, Low and Close.
Likewise, I tried resampling my Adjusted Close daily price data to get weekly OHLC:
close = SP['Adj Close']
# recent pandas releases no longer accept how='ohlc'; call .ohlc() on the resampler instead
wk = close.resample('W').ohlc()
wk.head()
               open     high      low    close
Date
2010-01-10  1132.99  1144.98  1132.99  1144.98
2010-01-17  1146.98  1148.46  1136.03  1136.03
2010-01-24  1150.23  1150.23  1091.76  1091.76
2010-01-31  1096.78  1097.50  1073.87  1073.87
2010-02-07  1089.19  1103.32  1063.11  1066.19
Plotting this as a boxplot yields 4 boxes, one per column rather than one per row. For example, the first box, 'open', shows the range of the entire 'open' column.
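To make the per-column behaviour concrete, here is a small sketch that rebuilds the five weekly rows above by hand (so it does not depend on downloading anything) and counts the boxes that DataFrame.boxplot() draws. The hand-typed frame and the Agg backend line are my own additions for illustration, not part of the original session.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to view the plot interactively
import matplotlib.pyplot as plt
import pandas as pd

# The five weekly OHLC rows shown above, typed in by hand for illustration.
wk = pd.DataFrame(
    {
        "open": [1132.99, 1146.98, 1150.23, 1096.78, 1089.19],
        "high": [1144.98, 1148.46, 1150.23, 1097.50, 1103.32],
        "low": [1132.99, 1136.03, 1091.76, 1073.87, 1063.11],
        "close": [1144.98, 1136.03, 1091.76, 1073.87, 1066.19],
    },
    index=pd.to_datetime(
        ["2010-01-10", "2010-01-17", "2010-01-24", "2010-01-31", "2010-02-07"]
    ),
)

fig, ax = plt.subplots()
# return_type="dict" hands back the drawn artists so the boxes can be counted.
artists = wk.boxplot(ax=ax, return_type="dict")

# One box per column (4), even though the frame has 5 rows.
n_boxes = len(artists["boxes"])
print(n_boxes, "boxes for", wk.shape[0], "rows and", wk.shape[1], "columns")
```

So each box here is the distribution of five weekly values in one column, which is not what I am after.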
But what I actually want is one box for each 'Date' (the index, i.e. each row of my DataFrame). So the first box would show the OHLC of the first row, '2010-01-10', the second box would show the second row ('2010-01-17'), and so on.
What I really want, though, is for each row in my original daily data (the SP DataFrame) to be its own OHLC box. Essentially I want daily candlesticks, generated with boxplot().
               Open     High      Low    Close
Date
2010-01-04  1116.56  1133.87  1116.56  1132.99
How do I do this using a pandas DataFrame and Matplotlib's boxplot()? I just want a basic plot where each row of the DataFrame is an OHLC box. Nothing fancy at this point. Thanks!
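For reference, this is roughly the shape of result I have in mind, sketched with Matplotlib's lower-level Axes.bxp(), which draws boxes from precomputed statistics instead of computing quartiles itself. Everything here is an assumption on my part rather than a known-good answer: the helper name ohlc_box_stats is hypothetical, the data is the first three rows of SP.head() typed in by hand, and mapping the box edges to Open/Close, the whiskers to Low/High and the median line to Close is just one possible convention.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to view the plot interactively
import matplotlib.pyplot as plt
import pandas as pd

# First three daily rows from SP.head(), typed in by hand for illustration.
sp = pd.DataFrame(
    {
        "Open": [1116.56, 1132.66, 1135.71],
        "High": [1133.87, 1136.63, 1139.19],
        "Low": [1116.56, 1129.66, 1133.95],
        "Close": [1132.99, 1136.52, 1137.14],
    },
    index=pd.to_datetime(["2010-01-04", "2010-01-05", "2010-01-06"]),
)


def ohlc_box_stats(frame):
    """Hypothetical helper: turn each OHLC row into one stats dict for Axes.bxp()."""
    stats = []
    for date, row in frame.iterrows():
        stats.append(
            {
                "label": date.strftime("%Y-%m-%d"),
                "whislo": row["Low"],                   # lower whisker = day's low
                "q1": min(row["Open"], row["Close"]),   # bottom edge of the box
                "med": row["Close"],                    # "median" line reused for the close (assumption)
                "q3": max(row["Open"], row["Close"]),   # top edge of the box
                "whishi": row["High"],                  # upper whisker = day's high
                "fliers": [],                           # no outlier points
            }
        )
    return stats


stats = ohlc_box_stats(sp)
fig, ax = plt.subplots()
artists = ax.bxp(stats, showfliers=False)  # one box per row of sp
ax.set_ylabel("Price")
```

By contrast, transposing the frame and calling boxplot() on it would also give one box per date, but the box edges would then be quartiles computed from the four prices rather than the Open and Close themselves.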