Pandas/Python equivalent of Stata's "levelsof"

Question

I want to get a list of all distinct or unique values of one variable in a dataframe that coincide with a specific value of another variable in that dataframe.

In Stata I would use something like:

levelsof ID1 if ID2 == i

How do I do this in Python?

JohnE · Accepted Answer

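The closest pandas equivalent of Stata's levelsof is unique(). Both return the distinct values of a variable. One difference: levelsof returns them sorted, whereas unique() returns an array in order of first appearance.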

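>>> import pandas as pd
>>> df = pd.DataFrame({'id1': [0, 0, 1, 1, 2, 2],
...                    'id2': [5, 5, 5, 6, 6, 6]})
>>> df
   id1  id2
0    0    5
1    0    5
2    1    5
3    1    6
4    2    6
5    2    6

>>> # distinct values of id1 in the rows where id2 == 5
>>> df.loc[df['id2'] == 5, 'id1'].unique()
array([0, 1])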
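
If the result should look more like what levelsof produces (a sorted list, with missing values left out by default), something along these lines may work. This is only a sketch: the helper name levelsof is made up here and is not part of pandas, and the sample data is rearranged from the example above so that the sorting is visible.

```python
import pandas as pd


def levelsof(series):
    """Hypothetical helper: sorted distinct non-missing values as a list."""
    # dropna() removes missing values; unique() keeps first-appearance order,
    # so sorted() is applied afterwards to get an ordered list.
    return sorted(series.dropna().unique().tolist())


# Sample data (assumed for illustration), deliberately out of order.
df = pd.DataFrame({'id1': [2, 0, 1, 1, 0, 2],
                   'id2': [5, 5, 5, 6, 6, 6]})

levels = levelsof(df.loc[df['id2'] == 5, 'id1'])
print(levels)  # [0, 1, 2]
```

Returning a plain Python list rather than a NumPy array makes it easy to loop over the values, which is the usual reason for calling levelsof in Stata.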

Ami Tavory · Answer

Say your columns are ID1 and ID2, and the DataFrame is df. Then

df.ID1[df.ID2 == i]

will give all the values of the first column where the second one is i.

Following that, you can do

df.ID1[df.ID2 == i].value_counts()

to get the frequency of each value,

df.ID1[df.ID2 == i].unique()

to get unique values,

df.ID1[df.ID2 == i].describe()

to get a description, and so forth (I don't know what levelsof is exactly).
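
The calls above can be tried end to end with a small made-up DataFrame. The data and the value chosen for i are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data; column names follow the answer above.
df = pd.DataFrame({'ID1': [0, 0, 1, 1, 2, 2],
                   'ID2': [5, 5, 5, 6, 6, 6]})
i = 5  # the value of ID2 to filter on

subset = df.ID1[df.ID2 == i]      # values of ID1 in rows where ID2 == i
counts = subset.value_counts()    # frequency of each distinct value
levels = subset.unique()          # distinct values, in order of first appearance

print(subset.tolist())   # [0, 0, 1]
print(counts.to_dict())  # {0: 2, 1: 1}
print(levels.tolist())   # [0, 1]
```

Attribute access such as df.ID1 only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute; df['ID1'] works in every case.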

Tags: python, pandas, stata

user3784956
