
Get indexes of unique values in column (pandas)

Tags:

python

pandas

I need to get the row numbers of the values in x that occur only once. I came up with the following solution:

x = pv.index.get_level_values("Код")  # values of the index level
dups = x[x.duplicated()].unique()  # values that occur more than once
uniques = x[~x.isin(dups)]  # values that occur exactly once
uniques_indexes = np.where(x.isin(uniques))[0].tolist()  # their row positions

I think that's too many calculations. Is there a better solution?

Winand asked Sep 27 '22 12:09



1 Answer

import pandas as pd
import numpy as np

np.random.seed(100)
index = np.random.choice('A B C D E F G'.split(), 10)
pv = pd.DataFrame(np.random.randn(10), index=index, columns=['value'])

Out[60]: 
    value
A -0.2347
A -1.4397
D  0.4328
A  2.3045
C -0.1226
G  0.0155
E  0.2660
C -0.1138
F  1.0111
C -1.4408

# reset_index first so the row numbers become the index and the labels move to a column
pv.reset_index(inplace=True)

Out[128]: 
  index   value
0     A -0.2347
1     A -1.4397
2     D  0.4328
3     A  2.3045
4     C -0.1226
5     G  0.0155
6     E  0.2660
7     C -0.1138
8     F  1.0111
9     C -1.4408

# group by the column that held your index level and keep only the groups with a single row
pv.sort_index().groupby('index').filter(lambda group: len(group) == 1)


Out[129]: 
  index   value
2     D  0.4328
5     G  0.0155
6     E  0.2660
8     F  1.0111
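
A shorter route may also work here, assuming the level values are available as a pandas Index as in the question. This is only a sketch: the labels below are hypothetical sample data that mirror the example frame above, and `duplicated(keep=False)` is swapped in for the groupby/filter step.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "A" and "C" repeat, the other labels occur once.
labels = pd.Index(["A", "A", "D", "A", "C", "G", "E", "C", "F", "C"], name="Код")

# keep=False marks every occurrence of a repeated value, not only the later ones.
is_repeated = labels.duplicated(keep=False)

# Row positions of the values that occur exactly once.
unique_positions = np.flatnonzero(~is_repeated).tolist()
print(unique_positions)  # [2, 5, 6, 8]
```

With the question's data, `labels` would presumably be `pv.index.get_level_values("Код")`, which avoids both the intermediate `isin` calls and the `reset_index` step.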
Jianxun Li answered Oct 28 '22 17:10
