python pandas remove duplicates in series

Question

Is there a function to enforce that the index is unique, or is it only possible to handle this in Python itself, by converting to a dict and back or something like that?

As noted in the comments below: pandas is a Python project built on NumPy/SciPy.

Converting with to_dict and back works (the last value for each duplicated label wins), but I bet this gets slow when the Series gets big.

In [24]: a = pandas.Series([1,2,3], index=[1,1,2])

In [25]: a
Out[25]: 
1    1
1    2
2    3

In [26]: a = a.to_dict()

In [27]: a
Out[27]: {1: 2, 2: 3}

In [28]: a = pandas.Series(a)

In [29]: a
Out[29]: 
1    2
2    3
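
The round trip above can be reproduced as a standalone script. This is only a sketch: it assumes pandas is imported as `pd`, and the name `deduped` is made up for illustration. `Index.is_unique` is used to check whether the index has duplicate labels before and after.

```python
import pandas as pd

# Same data as in the session above: label 1 appears twice.
a = pd.Series([1, 2, 3], index=[1, 1, 2])
print(a.index.is_unique)  # reports whether all index labels are distinct

# A dict can hold each key only once, so later values overwrite earlier ones.
as_dict = a.to_dict()

# Rebuilding the Series from the dict gives one row per label.
deduped = pd.Series(as_dict)
print(deduped)
print(deduped.index.is_unique)
```

Note that this keeps the last value for each duplicated label, and it goes through Python objects, which is the part likely to be slow on large data.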

Wes McKinney · Accepted Answer

BTW we plan on adding a drop_duplicates method to Series like DataFrame.drop_duplicates in the near future.
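
For reference, a minimal sketch of de-duplicating on the index, assuming a pandas release that provides `Index.duplicated` with its `keep` parameter. In the releases where `Series.drop_duplicates` exists, it compares values rather than index labels, so a boolean mask on the index is used here instead. The names `keep_first` and `keep_last` are illustrative only.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=[1, 1, 2])

# Index.duplicated marks repeated labels; keep="first" leaves the first
# occurrence unmarked, keep="last" leaves the last occurrence unmarked.
keep_first = a[~a.index.duplicated(keep="first")]
keep_last = a[~a.index.duplicated(keep="last")]

print(keep_first)
print(keep_last)
```

This avoids the dict round trip and keeps the original row order of the rows that survive.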

root · Answer

Use groupby on the index (level=0), then first() or last() depending on which of the duplicates you want to keep.

In [279]: s
Out[279]: 
a    1
b    2
b    3
b    4
e    5

In [280]: grouped = s.groupby(level=0)

In [281]: grouped.first()
Out[281]: 
a    1
b    2
e    5

In [282]: grouped.last()
Out[282]: 
a    1
b    4
e    5
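
The session above does not show how `s` was built. A self-contained sketch, assuming a Series constructed to match the printed output (the construction itself is a guess, and `first` and `last` are illustrative names):

```python
import pandas as pd

# Assumed construction that reproduces the Series shown in Out[279].
s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "b", "b", "e"])

# level=0 groups rows by their index label.
grouped = s.groupby(level=0)

first = grouped.first()  # first value seen for each label
last = grouped.last()    # last value seen for each label

print(first)
print(last)
```

By default groupby returns the groups sorted by label, so the result order may differ from the original order if the index was not already sorted.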

Tags: python, pandas

mathtick
