
Pandas/Python equivalent of Stata's "levelsof"

I want to get a list of all distinct or unique values of one variable in a dataframe that coincide with a specific value of another variable in that dataframe.

In Stata I would use something like:

levelsof ID1 if ID2 == i

How do I do this in Python?

user3784956 asked Apr 12 '26 23:04


2 Answers

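Stata's levelsof corresponds closely to pandas's Series.unique(). Both return the distinct values of a variable. One difference: levelsof lists the values in sorted order, while unique() returns a NumPy array in order of first appearance.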

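>>> import pandas as pd
>>> df = pd.DataFrame({'id1': [0, 0, 1, 1, 2, 2],
...                    'id2': [5, 5, 5, 6, 6, 6]})
>>> df
   id1  id2
0    0    5
1    0    5
2    1    5
3    1    6
4    2    6
5    2    6

>>> # distinct values of id1 in the rows where id2 == 5
>>> df.loc[df['id2'] == 5, 'id1'].unique()
array([0, 1])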
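
If the sorted, missing-free output of levelsof matters, the same idea can be wrapped in a small helper. This is only a sketch: the function name `levelsof` and its `condition` parameter are made up for illustration and are not part of pandas.

```python
import pandas as pd

def levelsof(df, column, condition=None):
    """Hypothetical helper (not a pandas API): sorted distinct
    non-missing values of `column`, optionally restricted to the
    rows where the boolean Series `condition` is True."""
    values = df[column] if condition is None else df.loc[condition, column]
    # dropna() skips missing values; sorted() orders the result
    return sorted(values.dropna().unique().tolist())

# Same data as above, but with id1 deliberately out of order
df = pd.DataFrame({'id1': [2, 0, 1, 1, 0, 2],
                   'id2': [5, 5, 5, 6, 6, 6]})

levels = levelsof(df, 'id1', df['id2'] == 5)
print(levels)  # [0, 1, 2]
```

Returning a plain Python list keeps the result easy to loop over, much like iterating over the macro that levelsof stores in Stata.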
JohnE answered Apr 14 '26 12:04


Say your columns are ID1 and ID2, and the DataFrame is df. Then

df.ID1[df.ID2 == i] 

will give all the values of the first column where the second one is i.

Following that, you can do

df.ID1[df.ID2 == i].value_counts()

to get a count of each distinct value,

df.ID1[df.ID2 == i].unique()

to get unique values,

df.ID1[df.ID2 == i].describe()

to get summary statistics, and so forth (I don't know what levelsof is exactly).
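
Put together, the expressions above can be tried on a small frame. The data and the value of `i` below are invented for illustration. The `groupby` line at the end is an addition not mentioned in the answer: it is one possible way to collect the distinct ID1 values for every ID2 at once instead of filtering on one `i` at a time.

```python
import pandas as pd

# Made-up example data; column names follow the answer above
df = pd.DataFrame({'ID1': [0, 0, 1, 1, 2, 2],
                   'ID2': [5, 5, 5, 6, 6, 6]})
i = 5

subset = df.ID1[df.ID2 == i]      # all ID1 values where ID2 equals i
counts = subset.value_counts()    # how often each distinct value occurs
distinct = subset.unique()        # distinct values, in order of appearance

print(subset.tolist())    # [0, 0, 1]
print(counts.to_dict())   # {0: 2, 1: 1}
print(distinct.tolist())  # [0, 1]

# Not from the answer: distinct ID1 values for each ID2 in one pass
levels_by_id2 = df.groupby('ID2')['ID1'].unique()
print(levels_by_id2.loc[6].tolist())  # [1, 2]
```

Attribute access such as `df.ID1` only works when the column name is a valid Python identifier and does not clash with a DataFrame method; `df['ID1']` works in all cases.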

Ami Tavory answered Apr 14 '26 13:04


