
Pandas DataFrame column value remapping

Tags: python, pandas

Assuming the following DataFrame:

import pandas as pd
df = pd.DataFrame({'id': [8, 16, 23, 8, 23], 'count': [5, 8, 7, 1, 2]}, columns=['id', 'count'])

   id  count
0   8      5
1  16      8
2  23      7
3   8      1
4  23      2

...is there some Pandas magic that allows me to remap the ids so that the ids become sequential? Looking for a result like:

   id  count
0   0      5
1   1      8
2   2      7
3   0      1
4   2      2

where the original ids [8,16,23] were remapped to [0,1,2].

Note: the remapping doesn't have to maintain original order of ids. For example, the following remapping would also be fine: [8,16,23] -> [2,0,1], but the id space after remapping should be contiguous.

I'm currently using a for loop and a dict to keep track of the remapping, but it feels like Pandas might have a better solution.
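For reference, the loop-and-dict approach described above might look roughly like the sketch below. This is a guess at the shape of such a baseline, not the asker's actual code; the names `mapping`, `new_ids` and `result` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': [8, 16, 23, 8, 23], 'count': [5, 8, 7, 1, 2]},
                  columns=['id', 'count'])

# Hypothetical baseline: assign the next free integer to each id
# the first time it is seen, and reuse it on later occurrences.
mapping = {}
new_ids = []
for old_id in df['id']:
    if old_id not in mapping:
        mapping[old_id] = len(mapping)
    new_ids.append(mapping[old_id])

# Build a new frame with the remapped ids; the original df is left unchanged.
result = df.assign(id=new_ids)
```

This produces contiguous ids in order of first appearance, which is the behaviour a vectorised solution needs to match.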

asked Jun 28 '26 00:06 by borice


1 Answer

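Use pd.factorize. It returns a tuple (codes, uniques); the codes (element [0]) are the sequential ids, numbered in order of first appearance: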

>>> df
   id  count
0   8      5
1  16      8
2  23      7
3   8      1
4  23      2
>>> df['id'] = pd.factorize(df['id'])[0]
>>> df
   id  count
0   0      5
1   1      8
2   2      7
3   0      1
4   2      2
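If the original ids are needed later, it may help to keep the second return value as well. The following is a minimal sketch, assuming the same df as in the question; the variable names `codes`, `uniques`, `remapped` and `restored` are chosen here for illustration, and `sort=True` is optional (it numbers the ids by sorted value instead of by first appearance).

```python
import pandas as pd

df = pd.DataFrame({'id': [8, 16, 23, 8, 23], 'count': [5, 8, 7, 1, 2]},
                  columns=['id', 'count'])

# factorize returns (codes, uniques): codes are the new contiguous ids,
# and uniques[i] is the original id that was mapped to code i.
codes, uniques = pd.factorize(df['id'], sort=True)

# New frame with the remapped ids; df itself is not modified.
remapped = df.assign(id=codes)

# Map the codes back to the original ids.
restored = uniques.take(codes)
```

Note that by default factorize assigns the code -1 to missing values, so an id column containing NaN will not come out fully contiguous without handling those rows first.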
answered Jul 01 '26 03:07 by behzad.nouri


