I want to iterate over a dataframe's major axis date by date.
Example:
tdf = df.ix[date]
The issue I am having is that the type returned by df.ix
changes depending on how many rows match, leaving me with 3 possible situations:
If the date does not exist in df's index
an error is thrown: KeyError: 1394755200000000000
If only one row in df has that date
print type(tdf)
returns
<class 'pandas.core.series.Series'>
If more than one row in df has that date
print type(tdf)
returns
<class 'pandas.core.frame.DataFrame'>
To avoid the first case I can simply wrap the lookup in a try/except
block or, thanks to jxstanford, I can avoid the try/except block by using if date in df.index:
After that I still run into an inconsistent API between a pandas Series and a pandas DataFrame. I could solve this by checking the type, but it seems I shouldn't have to do that. Ideally I would like the returned type to stay the same. Is there a better way of doing this?
I'm running pandas 0.13.1 and I am currently loading my data from a CSV using DataFrame.from_csv.
Here's a full example demonstrating the problem.
from pandas import DataFrame
import datetime
path_to_csv = '/home/n/Documents/port/test.csv'
df = DataFrame.from_csv(path_to_csv, index_col=3, header=0, parse_dates=True, sep=',')
start_dt = df.index.min()
end_dt = df.index.max()
dt_step = datetime.timedelta(days=1)
df.sort_index(inplace=True)
cur_dt = start_dt
while cur_dt != end_dt:
    if cur_dt in df.index:
        print type(df.ix[cur_dt])
        # run some other steps using cur_dt
    cur_dt += dt_step
An example CSV that demonstrates the problem is as follows:
value1,value2,value3,Date,type
1,2,4,03/13/14,a
2,3,3,03/21/14,b
3,4,2,03/21/14,a
4,5,1,03/27/14,b
The above code prints out
<class 'pandas.core.series.Series'>
<class 'pandas.core.frame.DataFrame'>
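For anyone who wants to reproduce this without the CSV file, here is a minimal sketch with the same rows inlined. It assumes a recent pandas on Python 3, so it uses read_csv and .loc in place of DataFrame.from_csv and .ix (those substitutions are assumptions, not what the code above runs). The variable names single and multiple are only for illustration.

```python
import io
import pandas as pd

# Same rows as the example CSV, inlined so no file on disk is needed.
csv_text = """value1,value2,value3,Date,type
1,2,4,03/13/14,a
2,3,3,03/21/14,b
3,4,2,03/21/14,a
4,5,1,03/27/14,b
"""

# Assumption: read_csv + to_datetime stand in for DataFrame.from_csv.
df = pd.read_csv(io.StringIO(csv_text), index_col="Date")
df.index = pd.to_datetime(df.index, format="%m/%d/%y")
df = df.sort_index()

# Assumption: .loc stands in for .ix for label-based lookup.
single = df.loc[pd.Timestamp("2014-03-13")]    # date appears once
multiple = df.loc[pd.Timestamp("2014-03-21")]  # date appears twice

print(type(single).__name__)
print(type(multiple).__name__)
```

The first lookup should come back as a Series and the second as a DataFrame, which is the same inconsistency described above.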
Is it possible to get the value of value1
from tdf in a consistent manner? Or am I stuck writing an if statement and handling each case separately?
if type(df.ix[cur_dt]) == DataFrame:
    ....
if type(df.ix[cur_dt]) == Series:
    ....
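For comparison, one approach that might avoid the type check is to select rows with a boolean mask on the index instead of a label lookup. This is only a sketch under assumptions: it uses a recent pandas on Python 3, a small hand-built frame with just value1 and type, and a hypothetical helper named rows_for that is not part of pandas.

```python
import pandas as pd

# Small stand-in for the example data (only two of the columns).
df = pd.DataFrame(
    {"value1": [1, 2, 3, 4], "type": ["a", "b", "a", "b"]},
    index=pd.to_datetime(
        ["2014-03-13", "2014-03-21", "2014-03-21", "2014-03-27"]
    ),
)


def rows_for(frame, date):
    """Hypothetical helper: rows whose index equals `date`, as a DataFrame."""
    # A boolean mask keeps the result two-dimensional for 0, 1 or many matches.
    return frame[frame.index == date]


one = rows_for(df, pd.Timestamp("2014-03-13"))   # one matching row
two = rows_for(df, pd.Timestamp("2014-03-21"))   # two matching rows
none = rows_for(df, pd.Timestamp("2014-03-14"))  # no matching row

for result in (one, two, none):
    print(type(result).__name__, result["value1"].tolist())
```

With a mask, a missing date gives an empty DataFrame rather than a KeyError, so the same result["value1"] access works in all three situations. Whether this is the best option for a large frame is a separate question, since the mask scans the whole index on every lookup.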