Python Pandas differing value_counts() in two columns of same len()

Question

I have a pandas DataFrame with two columns: trace numbers [col_1] and ID numbers [col_2]. Trace numbers can be duplicated, as can ID numbers; however, each trace and ID should correspond to only one specific partner in the adjacent column.

The two columns are the same length but have different unique value counts, which should be the same, as shown below:

in[1]:  Trace | ID
        1     | 5054
        2     | 8291
        3     | 9323
        4     | 9323
        ...   |
        100   | 8928

in[2]:  print('unique traces: ', len(df['Trace'].value_counts()))
        print('unique IDs: ', len(df['ID'].value_counts()))

out[2]: unique traces: 100
        unique IDs: 99

In the code above, the same ID number (9323) is represented by two Trace numbers (3 & 4). How can I isolate these instances? Thanks for looking!

DocZerø · Accepted Answer

By using the duplicated() method, you can do the following:

df[df['ID'].duplicated(keep=False)]

By setting keep to False, we get all the duplicates (instead of excluding the first or the last one).

Which returns:

   Trace    ID
2      3  9323
3      4  9323
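
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The five-row frame is hypothetical sample data modelled on the rows in the question, not the asker's real data, and the groupby step is an optional extra rather than part of the original answer. It contrasts keep=False with the default keep='first', then counts distinct traces per ID.

```python
import pandas as pd

# Hypothetical sample data modelled on the rows shown in the question
df = pd.DataFrame({
    "Trace": [1, 2, 3, 4, 5],
    "ID": [5054, 8291, 9323, 9323, 8928],
})

# keep=False flags every row whose ID occurs more than once
all_dupes = df[df["ID"].duplicated(keep=False)]

# The default keep='first' flags only the later occurrences
later_dupes = df[df["ID"].duplicated()]

# Optional extra: distinct traces per ID; more than 1 means the ID is shared
traces_per_id = df.groupby("ID")["Trace"].nunique()
shared_ids = traces_per_id[traces_per_id > 1]

print(all_dupes)
print(later_dupes)
print(shared_ids)
```

With keep=False both offending rows (traces 3 and 4) come back, which is usually what you want when inspecting the mismatch. The default would return only the second one.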
