
Reindexing error makes no sense

I have DataFrames between 100k and 2m rows in size. The one I am dealing with for this question is this large, but note that I will have to do the same for the other frames:

>>> len(data)
357451

This file was created by compiling many files, so its index is really odd. All I wanted to do was reindex it with range(len(data)), but I get this error:

>>> data.reindex(index=range(len(data)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.py", line 2542, in reindex
    fill_value, limit)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.py", line 2618, in _reindex_index
    limit=limit)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/index.py", line 893, in reindex
    limit=limit)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/index.py", line 812, in get_indexer
    raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects

This actually makes no sense. Since I am reindexing with an array containing numbers 0 through 357450, all Index objects are unique! Why is it returning this error?

Extra info: I am using Python 2.7 and pandas 0.11.0

Ryan Saxe asked May 01 '13 22:05


1 Answer

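When it complains that "Reindexing only valid with uniquely valued Index objects", it's not objecting that your new index isn't unique; it's objecting that your old one isn't. reindex has to match each new label against the existing index, and that match is ambiguous when an existing label appears more than once.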

For example:

>>> import pandas as pd
>>> df = pd.DataFrame(range(5), index=[1, 2, 3, 1, 2])
>>> df
   0
1  0
2  1
3  2
1  3
2  4
>>> df.reindex(index=range(len(df)))
Traceback (most recent call last):
[...]
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.12.0.dev_0bd5e77-py2.7-linux-i686.egg/pandas/core/index.py", line 849, in get_indexer
    raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects

but assigning a new index directly works, because it simply overwrites the old labels:

>>> df.index = range(len(df))
>>> df
   0
0  0
1  1
2  2
3  3
4  4

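Although I think I'd write

df = df.reset_index(drop=True)

instead. Note that reset_index returns a new DataFrame rather than modifying df in place, so the result has to be assigned.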
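To tie this back to the question, here is a minimal sketch of how the duplicate labels can arise and how to confirm them before resetting. The two small frames in `parts` and the column name `value` are made up for illustration, standing in for the many files that were compiled together. It also assumes a reasonably recent pandas, since `Index.duplicated` is not available in 0.11.

```python
import pandas as pd

# Hypothetical stand-in for a frame compiled from several files:
# each piece keeps its own 0-based index, so labels repeat after concat.
parts = [
    pd.DataFrame({"value": [10, 20, 30]}),
    pd.DataFrame({"value": [40, 50]}),
]
data = pd.concat(parts)

# The OLD index is what reindex() objects to, so check it first.
print(data.index.is_unique)
# Labels that occur more than once (first occurrences are not flagged).
print(data.index[data.index.duplicated()].tolist())

# reset_index returns a new frame with a fresh 0..n-1 index;
# drop=True discards the old labels instead of keeping them as a column.
clean = data.reset_index(drop=True)
print(clean.index.is_unique)
print(clean.index.tolist())

# If the pieces are still at hand, the duplicates can be avoided
# up front by asking concat to ignore the per-file indexes.
fresh = pd.concat(parts, ignore_index=True)
print(fresh.index.tolist())
```

If the old labels carry information worth keeping, dropping `drop=True` keeps them as an ordinary column instead of discarding them.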

DSM answered Oct 06 '22 01:10