Question

import datetime
import pandas as pd
import pandas.io.data

sp = pd.io.data.get_data_yahoo('^IXIC', start=datetime.datetime(1972, 1, 3),
                               end=datetime.datetime(2010, 1, 3))

I have used the example above, but it pulls only DAILY data into a DataFrame, and I would like weekly data. get_data_yahoo does not seem to have a parameter for choosing between daily, weekly, or monthly data, like the options available on Yahoo itself. Do you know of any other packages or ideas that might facilitate this?

Solution

You can downsample using the asfreq method:

sp = sp.asfreq('W-FRI', method='pad')

The pad method will propagate the last valid observation forward.

Using resample (as @tshauck has shown) is another possibility. Use asfreq if you want to guarantee that the values in your downsample are values found in the original data set. Use resample if you wish to aggregate groups of rows from the original data set (for example, by taking a mean). reindex might introduce NaN values if the original data set does not have a value on a date specified by the reindex, though (as @behzad.nouri points out) you could use method='pad' to propagate the last observations there as well.
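To make the difference between the three approaches concrete, here is a minimal sketch on synthetic data. The dates and values are made up for illustration and are not real index prices; one Friday is dropped to imitate a market holiday.

```python
import pandas as pd

# Synthetic daily series on business days (values are invented, not market data).
idx = pd.bdate_range("2024-01-01", "2024-01-19")  # Mon 1 Jan .. Fri 19 Jan
daily = pd.Series(range(len(idx)), index=idx, dtype="float64")

# Imitate a market holiday by removing Friday 12 Jan.
daily = daily.drop(pd.Timestamp("2024-01-12"))

# asfreq + pad: every weekly value is an observation from the original data;
# the missing Friday is filled with Thursday's value.
weekly_pad = daily.asfreq("W-FRI", method="pad")

# resample + mean: every weekly value is an aggregate of that week's rows.
weekly_mean = daily.resample("W-FRI").mean()

# reindex without a fill method: the missing Friday becomes NaN.
fridays = pd.date_range(daily.index.min(), daily.index.max(), freq="W-FRI")
weekly_reindex = daily.reindex(fridays)

print(weekly_pad)
print(weekly_mean)
print(weekly_reindex)
```

In this sketch the padded series reuses Thursday's value for the holiday week, the resampled series holds weekly means, and the plain reindex leaves a NaN for the missing Friday.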

OTHER TIPS

If you check the latest pandas source code on GitHub, you will see that an interval parameter is included in the latest master branch. You can manually update your local copy by overwriting data.py in your site-packages/pandas/io folder with that version.

You can always reindex to your desired frequency:

sp.reindex( pd.date_range( start=sp.index.min( ),
                           end=sp.index.max( ),
                           freq='W-WED' ) )  # weekly, Wednesdays

Edit: you may add method='ffill' to the reindex call to forward-fill NaN values.

As a suggestion, use Wednesdays, because they tend to have the fewest missing values (fewer NYSE holidays fall on a Wednesday). I think Yahoo's weekly data gives the stock price each Monday, which is the worst choice of weekly frequency, judging by S&P data from 2000 onwards:

import datetime as dt
import pandas.io.data as web
sp = web.DataReader("^GSPC", "yahoo", start=dt.date( 2000, 1, 1 ) )

weekday = { 0:'MON', 1:'TUE', 2:'WED', 3:'THU', 4:'FRI' }
sp[ 'weekday' ] = list( map( weekday.get, sp.index.dayofweek ) )
sp.weekday.value_counts( )

output:

WED    722
TUE    717
THU    707
FRI    705
MON    659

One option would be to mask on the day of week you want.

sp[sp.index.dayofweek == 0]

Another option would be to resample.

sp.resample('W').mean()  # older pandas versions spelled this sp.resample('W', how='mean')
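The two options behave differently when the chosen weekday is missing. The following is a small sketch on synthetic data (invented values, with one Monday dropped to imitate a holiday): masking silently skips that week, while resampling still yields one row per week.

```python
import pandas as pd

# Synthetic daily closes on business days (values are invented, not market data).
idx = pd.bdate_range("2024-01-01", "2024-01-19")
daily = pd.DataFrame({"Close": range(len(idx))}, index=idx, dtype="float64")

# Imitate a holiday by removing Monday 15 Jan.
daily = daily.drop(pd.Timestamp("2024-01-15"))

# Option 1: keep only Mondays. The holiday week has no Monday row, so it vanishes.
mondays = daily[daily.index.dayofweek == 0]

# Option 2: aggregate each calendar week ('W' means weeks ending on Sunday).
weekly = daily.resample("W").mean()

print(len(mondays), len(weekly))
```

If every week must be represented, resampling (or asfreq with a fill method) is probably the safer choice of the two.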

This is how I convert daily to weekly price data:

import datetime
import pandas as pd
import pandas_datareader.data as web

start = datetime.datetime(1972, 1, 3)
end = datetime.datetime(2010, 1, 3)

stock_d = web.DataReader('^IXIC', 'yahoo', start, end)

def week_open(array_like):
    return array_like[0]

def week_close(array_like):
    return array_like[-1]

stock_w = stock_d.resample('W',
                    how={'Open': week_open, 
                         'High': 'max',
                         'Low': 'min',
                         'Close': week_close,
                         'Volume': 'sum'}, 
                    loffset=pd.offsets.timedelta(days=-6))

stock_w = stock_w[['Open', 'High', 'Low', 'Close', 'Volume']]
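As far as I know, recent pandas releases no longer accept the how= and loffset= arguments of resample. A rough equivalent of the snippet above, sketched here on synthetic data (the prices and volumes are invented, and no download is performed), uses .agg with the built-in 'first' and 'last' aggregations and shifts the index by hand to label each week by its Monday.

```python
import pandas as pd

# Synthetic daily OHLCV frame covering two business weeks (invented values).
idx = pd.bdate_range("2024-01-01", "2024-01-12")
base = [10.0 + i for i in range(len(idx))]
stock_d = pd.DataFrame(
    {
        "Open": base,
        "High": [b + 2.0 for b in base],
        "Low": [b - 1.0 for b in base],
        "Close": [b + 1.0 for b in base],
        "Volume": [100] * len(idx),
    },
    index=idx,
)

# One aggregation per column: first open, highest high, lowest low,
# last close, and total volume of each week (weeks end on Sunday).
stock_w = stock_d.resample("W").agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}
)

# Stand-in for the old loffset argument: move each label from Sunday back to Monday.
stock_w.index = stock_w.index - pd.Timedelta(days=6)

print(stock_w)
```

With real data, stock_d would come from the DataReader call shown earlier; the aggregation step itself should not need to change.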

More info:

https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#yahoo-finance
https://gist.github.com/prithwi/339f87bf9c3c37bb3188

Licensed under: CC-BY-SA with attribution