Question

First, if I use DataReader to read in the data and then plot it, everything works:

In [55]: t = DataReader('SPY','yahoo', start=datetime.datetime(1990,1,1))

In [56]: t
Out[56]: 
<class 'pandas.core.frame.DataFrame'>
Index: 5033 entries, 1993-01-29 00:00:00 to 2013-01-23 00:00:00
Data columns:
Open         5033  non-null values
High         5033  non-null values
Low          5033  non-null values
Close        5033  non-null values
Volume       5033  non-null values
Adj Close    5033  non-null values
dtypes: float64(5), int64(1)

In [58]: t.plot()
Out[58]: <matplotlib.axes.AxesSubplot at 0x8cca790>

However, if I save it as a CSV file and reload it, I get an error message and the plot is not right either:

In [62]: t.to_csv('spy.csv')

In [63]: s = pd.read_csv('spy.csv', na_values=[" "])

In [64]: s.set_index('Date')
Out[64]: 
<class 'pandas.core.frame.DataFrame'>
Index: 5033 entries, 1993-01-29 00:00:00 to 2013-01-23 00:00:00
Data columns:
Open         5033  non-null values
High         5033  non-null values
Low          5033  non-null values
Close        5033  non-null values
Volume       5033  non-null values
Adj Close    5033  non-null values
dtypes: float64(5), int64(1)

In [66]: s.plot()                                                            
--------------------------------------------------------------------------- 
AttributeError                            Traceback (most recent call last) 
/home/dli/pythonTest/pandas/<ipython-input-66-d3eb09d34df4> in <module>()   
----> 1 s.plot()                                                            

/usr/lib/pymodules/python2.7/pandas/core/frame.pyc in plot(self, subplots, sharex, sharey, use_index, figsize, grid, legend, rot, ax, kind, **kwds)
3748                     ax.legend(loc='best')                              
3749                 else:                                                  
-> 3750                     ax.plot(x, y, label=str(col), **kwds)           
3751                                                                        
3752                 ax.grid(grid)                                          

/usr/lib/pymodules/python2.7/matplotlib/axes.pyc in plot(self, *args, **kwargs)
3891         lines = []                                                     
3892                                                                        
-> 3893         for line in self._get_lines(*args, **kwargs):               
3894             self.add_line(line)                                        
3895             lines.append(line)                                         

/usr/lib/pymodules/python2.7/matplotlib/axes.pyc in _grab_next_args(self, *args, **kwargs)
    320                 return                                              
    321             if len(remaining) <= 3:                                 
--> 322                 for seg in self._plot_args(remaining, kwargs):      
    323                     yield seg                                       
    324                 return                                              

/usr/lib/pymodules/python2.7/matplotlib/axes.pyc in _plot_args(self, tup, kwargs)
    279         ret = []                                                    
    280         if len(tup) > 1 and is_string_like(tup[-1]):                
--> 281             linestyle, marker, color = _process_plot_format(tup[-1])
    282             tup = tup[:-1]                                          
    283         elif len(tup) == 3:                                         

/usr/lib/pymodules/python2.7/matplotlib/axes.pyc in _process_plot_format(fmt)
     93     # handle the multi char special cases and strip them from the
     94     # string
---> 95     if fmt.find('--')>=0:
     96         linestyle = '--'
     97         fmt = fmt.replace('--', '')

AttributeError: 'numpy.ndarray' object has no attribute 'find'            

Any idea how to fix it?

Thanks. Dan


Solution

The set_index method returns a new DataFrame by default rather than modifying s in place (many pandas methods behave the same way). To change s itself, pass the inplace argument:

s.set_index('Date', inplace=True)
s.plot()

After that, the plot works as you intended.
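
The difference between the returned frame and the original can be sketched on a small made-up frame. The two rows below are hypothetical placeholders, not the actual SPY data, and reassigning the result is shown as an alternative to inplace=True:

```python
import pandas as pd

# Hypothetical two-row stand-in for the frame loaded from spy.csv
s = pd.DataFrame({
    "Date": ["2013-01-22", "2013-01-23"],
    "Close": [149.13, 149.37],
})

# set_index returns a new DataFrame; s itself is left untouched
indexed = s.set_index("Date")
original_columns = list(s.columns)     # still contains "Date"
indexed_columns = list(indexed.columns)  # "Date" has moved to the index

# Reassigning the result is an alternative to inplace=True
s = s.set_index("Date")
index_name = s.index.name
```

Either form leaves s indexed by Date, so a later s.plot() no longer tries to plot the Date column as data.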

Note: to convert the Index to a DatetimeIndex, you can use pd.to_datetime (older pandas versions also had an Index.to_datetime method, which has since been removed):

s.index = pd.to_datetime(s.index)
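
Both steps can also be folded into the read itself. This is a minimal sketch, assuming the CSV has a Date column as written by to_csv in the question. The in-memory text is a hypothetical stand-in for spy.csv:

```python
import io
import pandas as pd

# Hypothetical two-row stand-in for the contents of spy.csv
csv_text = (
    "Date,Open,Close\n"
    "2013-01-22,148.33,149.13\n"
    "2013-01-23,149.13,149.37\n"
)

# index_col selects the index column; parse_dates=True asks pandas
# to parse that index as dates while loading
s = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

is_datetime_index = isinstance(s.index, pd.DatetimeIndex)
```

With a real file, the same keyword arguments would be passed along with the file name instead of the StringIO object.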

To see that s was left unchanged by your call to s.set_index('Date'), compare the two outputs:

In [63]: s = pd.read_csv('spy.csv', na_values=[" "])

In [64]: s.set_index('Date')
Out[64]: 
<class 'pandas.core.frame.DataFrame'>
Index: 5033 entries, 1993-01-29 00:00:00 to 2013-01-23 00:00:00
Data columns:
Open         5033  non-null values
High         5033  non-null values
Low          5033  non-null values
Close        5033  non-null values
Volume       5033  non-null values
Adj Close    5033  non-null values
dtypes: float64(5), int64(1)

In [65]: s
Out[65]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5033 entries, 0 to 5032
Data columns:
Date         5033  non-null values
Open         5033  non-null values
High         5033  non-null values
Low          5033  non-null values
Close        5033  non-null values
Volume       5033  non-null values
Adj Close    5033  non-null values
dtypes: float64(5), int64(1), object(1)
Licensed under: CC-BY-SA with attribution