Why is pd.unique() faster than np.unique()?

Question

I compared pandas.unique() with numpy.unique() and found that the former is noticeably faster than the latter.
I am not sure whether the speed advantage grows linearly with the input size.

Can anyone tell me why this difference exists, in terms of how the two are implemented? And in which cases should I use each one?

Dylan McCullough · Accepted Answer

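np.unique() is sort-based by default: it sorts the array and then drops adjacent duplicates, so it costs roughly O(n log n) and returns the unique values in sorted order.

pd.unique(), by contrast, is hash-table-based: it makes a single pass over the data, keeps each value the first time it is seen, and returns the uniques in order of appearance without sorting. That is roughly O(n), which is why it is usually faster on long arrays. There is no cached 'unique' metadata involved; the result is computed on every call. Use np.unique() when you need sorted output or its extras (return_counts, return_index, return_inverse), and pd.unique() when you only need the distinct values.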
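To see the difference yourself, here is a minimal sketch. The array contents, the size of the random test data, and the repeat count are arbitrary choices for illustration, and the timings will vary with dtype, data distribution, and library versions, so treat the numbers as indicative only.

```python
import timeit

import numpy as np
import pandas as pd

# Small example: the two functions agree on the values but not on the order.
values = np.array([3, 1, 3, 2, 1, 2])
sorted_uniques = np.unique(values)    # sorted result
ordered_uniques = pd.unique(values)   # order of first appearance, no sort

print("np.unique:", sorted_uniques)
print("pd.unique:", ordered_uniques)

# Rough timing on a larger array (sizes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
big = rng.integers(0, 1000, size=200_000)

t_np = timeit.timeit(lambda: np.unique(big), number=5)
t_pd = timeit.timeit(lambda: pd.unique(big), number=5)
print(f"np.unique: {t_np:.4f}s  pd.unique: {t_pd:.4f}s")
```

If you need sorted output from pd.unique(), you can sort the (usually much smaller) result afterwards with np.sort().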

Tags: python, pandas, numpy, data-analysis, data-science

Songcheng Li
