I have the following code attempting to plot a timeseries. Note, I drop the second column because it's not relevant. And I drop the first and last rows.
import pandas as pd
activity = pd.read_csv('activity.csv', index_col=2)
activity = activity.iloc[1:-1]  # drop first and last rows due to invalid data (.ix was removed in newer pandas)
series = activity['activity']
series.plot()
I get the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-10-36df40c11065> in <module>()
----> 1 series.plot()
.../pandas/tools/plotting.pyc in plot_series(series, label, kind, use_index,
rot, xticks, yticks, xlim, ylim,
ax, style, grid, logy,
secondary_y, **kwds)
1326 secondary_y=secondary_y, **kwds)
1327
-> 1328 plot_obj.generate()
1329 plot_obj.draw()
1330
.../pandas/tools/plotting.pyc in generate(self)
573 self._compute_plot_data()
574 self._setup_subplots()
--> 575 self._make_plot()
576 self._post_plot_logic()
577 self._adorn_subplots()
.../pandas/tools/plotting.pyc in _make_plot(self)
916 args = (ax, x, y, style)
917
--> 918 newline = plotf(*args, **kwds)[0]
919 lines.append(newline)
920 leg_label = label
.../matplotlib/axes.pyc in plot(self, *args, **kwargs)
3991 lines = []
3992
-> 3993 for line in self._get_lines(*args, **kwargs):
3994 self.add_line(line)
3995 lines.append(line)
.../matplotlib/axes.pyc in _grab_next_args(self, *args, **kwargs)
328 return
329 if len(remaining) <= 3:
--> 330 for seg in self._plot_args(remaining, kwargs):
331 yield seg
332 return
.../matplotlib/axes.pyc in _plot_args(self, tup, kwargs)
287 ret = []
288 if len(tup) > 1 and is_string_like(tup[-1]):
--> 289 linestyle, marker, color = _process_plot_format(tup[-1])
290 tup = tup[:-1]
291 elif len(tup) == 3:
.../matplotlib/axes.pyc in _process_plot_format(fmt)
94 # handle the multi char special cases and strip them from the
95 # string
---> 96 if fmt.find('--')>=0:
97 linestyle = '--'
98 fmt = fmt.replace('--', '')
AttributeError: 'numpy.ndarray' object has no attribute 'find'
If I try it with a small dataset such as:
target, weekday, timestamp
0, Sat, 08 Dec 2012 16:26:26:625000
0, Sat, 08 Dec 2012 16:26:27:625000
0, Sat, 08 Dec 2012 16:26:28:625000
0, Sat, 08 Dec 2012 16:26:29:625000
1, Sat, 08 Dec 2012 16:26:30:625000
2, Sat, 08 Dec 2012 16:26:31:625000
0, Sat, 08 Dec 2012 16:26:32:625000
0, Sat, 08 Dec 2012 16:26:33:625000
1, Sat, 08 Dec 2012 16:26:34:625000
2, Sat, 08 Dec 2012 16:26:35:625000
it works, but not on my full dataset. https://dl.dropbox.com/u/60861504/activity.csv
Also, I tried it with the first 10 items from my dataset and got the same error, but if I assign one value manually (series[10] = 5), the plot shows up. I'm stumped.
The answer is in the error message:
AttributeError: 'numpy.ndarray' object has no attribute 'find'
The inferred datatype of your series is string (try type(series[0])).
If you first convert the datatype:
series = series.astype(int)
series.plot()
and then the plot should work.
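To see this without the full file, here is a minimal sketch. The inline CSV is a made-up stand-in for activity.csv (I'm assuming the invalid first and last rows hold some non-numeric text in the activity column), and the plot call is left out so it runs without matplotlib:

```python
import io

import pandas as pd

# Hypothetical stand-in for activity.csv: the first and last data rows are
# invalid, which is an assumption about what the real file looks like.
raw = io.StringIO(
    "activity,weekday,timestamp\n"
    "invalid,Sat,08 Dec 2012 16:26:25\n"
    "0,Sat,08 Dec 2012 16:26:26\n"
    "1,Sat,08 Dec 2012 16:26:27\n"
    "2,Sat,08 Dec 2012 16:26:28\n"
    "invalid,Sat,08 Dec 2012 16:26:29\n"
)

activity = pd.read_csv(raw, index_col=2)
activity = activity.iloc[1:-1]  # drop the invalid first and last rows
series = activity["activity"]

# The dtype was inferred while the invalid rows were still present,
# so the remaining values are still strings.
print(series.dtype)
print(type(series.iloc[0]))

# Convert explicitly; after this the series is numeric.
converted = series.astype(int)
print(converted.dtype)
```

Dropping the bad rows after reading does not change the dtype that was already inferred, which is why the explicit conversion is needed.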
In my experience, this happens because of non-numeric columns in the DataFrame.
pd.read_csv tries to infer the datatype of each column. I suspect the corrupted values in your file are confusing this process, so you end up with columns of non-numeric types in your DataFrame.
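If you are not sure which values are the problem, one way to find them (a different technique from astype, and only a sketch) is pd.to_numeric with errors="coerce", which turns anything unparseable into NaN. The tiny CSV and the "oops" value here are invented for illustration:

```python
import io

import pandas as pd

# Invented example data: one non-numeric value hidden in the column.
raw = io.StringIO("activity\n0\n1\noops\n2\n")
df = pd.read_csv(raw)

# Values that cannot be parsed as numbers become NaN instead of raising.
numeric = pd.to_numeric(df["activity"], errors="coerce")

# Show the original values that failed to parse.
bad = df.loc[numeric.isna(), "activity"]
print(bad)

# Keep only the parseable values and convert them to integers.
clean = numeric.dropna().astype(int)
print(clean)
```

Once you know which rows are bad, you can drop or fix them before plotting.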