pandas plot time series ['numpy.ndarray' object has no attribute 'find']

I have the following code, which attempts to plot a time series. Note that I drop the second column because it's not relevant, and I drop the first and last rows.

import pandas as pd

activity = pd.read_csv('activity.csv', index_col=2)
activity = activity.ix[1:-1] #drop first and last rows due to invalid data
series = activity['activity']
series.plot()

I get the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-36df40c11065> in <module>()
----> 1 series.plot()

.../pandas/tools/plotting.pyc in plot_series(series, label, kind, use_index,
                                             rot, xticks, yticks, xlim, ylim,
                                             ax, style, grid, logy,
                                             secondary_y, **kwds)
   1326                      secondary_y=secondary_y, **kwds)
   1327 
-> 1328     plot_obj.generate()
   1329     plot_obj.draw()
   1330 

.../pandas/tools/plotting.pyc in generate(self)
    573         self._compute_plot_data()
    574         self._setup_subplots()
--> 575         self._make_plot()
    576         self._post_plot_logic()
    577         self._adorn_subplots()

.../pandas/tools/plotting.pyc in _make_plot(self)
    916                     args = (ax, x, y, style)
    917 
--> 918                 newline = plotf(*args, **kwds)[0]
    919                 lines.append(newline)
    920                 leg_label = label

.../matplotlib/axes.pyc in plot(self, *args, **kwargs)
   3991         lines = []
   3992 
-> 3993         for line in self._get_lines(*args, **kwargs):
   3994             self.add_line(line)
   3995             lines.append(line)

.../matplotlib/axes.pyc in _grab_next_args(self, *args, **kwargs)
    328                 return
    329             if len(remaining) <= 3:
--> 330                 for seg in self._plot_args(remaining, kwargs):
    331                     yield seg
    332                 return

.../matplotlib/axes.pyc in _plot_args(self, tup, kwargs)
    287         ret = []
    288         if len(tup) > 1 and is_string_like(tup[-1]):
--> 289             linestyle, marker, color = _process_plot_format(tup[-1])
    290             tup = tup[:-1]
    291         elif len(tup) == 3:

.../matplotlib/axes.pyc in _process_plot_format(fmt)
     94     # handle the multi char special cases and strip them from the
     95     # string
---> 96     if fmt.find('--')>=0:
     97         linestyle = '--'
     98         fmt = fmt.replace('--', '')

AttributeError: 'numpy.ndarray' object has no attribute 'find'

If I try it with a small dataset such as:

target, weekday, timestamp
0, Sat, 08 Dec 2012 16:26:26:625000
0, Sat, 08 Dec 2012 16:26:27:625000
0, Sat, 08 Dec 2012 16:26:28:625000
0, Sat, 08 Dec 2012 16:26:29:625000
1, Sat, 08 Dec 2012 16:26:30:625000
2, Sat, 08 Dec 2012 16:26:31:625000
0, Sat, 08 Dec 2012 16:26:32:625000
0, Sat, 08 Dec 2012 16:26:33:625000
1, Sat, 08 Dec 2012 16:26:34:625000
2, Sat, 08 Dec 2012 16:26:35:625000

it works, but not on my full dataset: https://dl.dropbox.com/u/60861504/activity.csv. I also tried it with the first 10 items from my dataset and got the same error, but if I assign one value manually (series[10] = 5), the plot shows up. I'm stumped.

Harry Moreno asked Mar 22 '13 21:03


2 Answers

The answer is in the error message:

AttributeError: 'numpy.ndarray' object has no attribute 'find'

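The inferred datatype of your series is string (try type(series[0])). Judging by the traceback, matplotlib then treats the array of strings as a format string and calls fmt.find on it, which fails.

If you first convert the datatype, the plot should work:

series = series.astype(int)
series.plot()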
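If some values cannot be parsed as integers, astype(int) will raise instead of converting. A minimal sketch of a more forgiving conversion follows, assuming the column holds strings with an occasional invalid entry. The sample values are hypothetical stand-ins, not taken from the linked activity.csv.

```python
import pandas as pd

# Hypothetical stand-in for the 'activity' column: the values were read as
# strings, and one entry ("n/a") is invalid.
series = pd.Series(["0", "1", "2", "n/a", "1"])

# A column of strings is not numeric, which is what trips up the plot call.
was_numeric = pd.api.types.is_numeric_dtype(series)

# errors="coerce" turns unparseable entries into NaN instead of raising.
numeric = pd.to_numeric(series, errors="coerce")

# Drop the NaN entries, then cast what remains to integers.
numeric = numeric.dropna().astype(int)
```

After this, numeric.plot() should receive numeric data. Whether dropping the invalid entries is appropriate depends on the dataset.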

herrfz answered Oct 30 '22 08:10


In my experience, this happens because of non-numeric columns in the DataFrame.

pd.read_csv tries to infer the datatype of each column. I suspect the corrupted values in your data are confusing this process, so you end up with columns of non-numeric types in your DataFrame.
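The inference behaviour can be checked on a small example. The sketch below uses made-up CSV text whose first and last data rows are invalid, mirroring the situation described in the question. The column name 'activity' follows the question's code, and the row contents are assumptions for illustration only.

```python
import io
import pandas as pd

# Hypothetical CSV text; the first and last data rows are deliberately invalid.
csv_text = """activity,weekday,timestamp
garbage,Sat,08 Dec 2012 16:26:25
0,Sat,08 Dec 2012 16:26:26
1,Sat,08 Dec 2012 16:26:27
2,Sat,08 Dec 2012 16:26:28
garbage,Sat,08 Dec 2012 16:26:29
"""

# A single non-numeric value is enough for the whole column to be read as strings.
raw = pd.read_csv(io.StringIO(csv_text), index_col=2)
raw_is_numeric = pd.api.types.is_numeric_dtype(raw["activity"])

# Skipping the bad rows at parse time lets the column be inferred as integers.
# skiprows=[1] skips the first data row (line 0 is the header);
# skipfooter is only supported by the Python parsing engine.
clean = pd.read_csv(
    io.StringIO(csv_text),
    index_col=2,
    skiprows=[1],
    skipfooter=1,
    engine="python",
)
clean_is_numeric = pd.api.types.is_numeric_dtype(clean["activity"])
```

Dropping rows after loading (as in the question) removes them from the DataFrame but does not change the dtype that was already inferred, which may be why the slice alone did not help.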

user1827356 answered Oct 30 '22 07:10