
How to rename a pd.value_counts() index with a correspondence dictionary

I am doing a value_counts() over a column of integers that represent categorical values.

I have a dict that maps the numbers to strings that correspond to the category name.

I want to find the best way to replace the index values with the corresponding names, as I am not happy with my 4-line solution.

My current solution

import pandas as pd

df = pd.DataFrame({"weather": [1,2,1,3]})
df
>>>
   weather
0        1
1        2
2        1
3        3

weather_correspondance_dict = {1:"sunny", 2:"rainy", 3:"cloudy"}

Here is how I currently solve the problem:

df_vc = df.weather.value_counts()
index = df_vc.index.map(lambda x: weather_correspondance_dict[x] )
df_vc.index = index
df_vc
>>>
sunny     2
cloudy    1
rainy     1
dtype: int64

Question

I am not happy with this solution, which is tedious. Is there a best practice for this situation?

Adrien Pacifico asked Jul 26 '18 10:07



2 Answers

This is my solution:

>>> weather_correspondance_dict = {1:"sunny", 2:"rainy", 3:"cloudy"}
>>> df["weather"].value_counts().rename(index=weather_correspondance_dict)
    sunny     2
    cloudy    1
    rainy     1
    Name: weather, dtype: int64
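
One detail of the rename approach that may matter in practice: index labels that are missing from the dictionary are left as they are instead of raising an error. The sketch below assumes an extra category code 4 with no entry in the dictionary; that extra value is added only for illustration and is not part of the question's data.

```python
import pandas as pd

# The code 4 is a hypothetical category with no name in the dictionary.
df = pd.DataFrame({"weather": [1, 2, 1, 3, 4]})
weather_correspondance_dict = {1: "sunny", 2: "rainy", 3: "cloudy"}

counts = df["weather"].value_counts()

# rename maps the labels found in the dictionary and keeps the others unchanged.
renamed = counts.rename(index=weather_correspondance_dict)
print(renamed)
```

If unmapped codes should stand out, leaving them as raw integers like this is usually easier to spot than silently dropping them.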
dimension answered Sep 20 '22 07:09



Here's a simpler solution:

weathers = ['sunny', 'rainy', 'cloudy']
weathers_dict = dict(enumerate(weathers, 1))

df_vc = df['weather'].value_counts()
df_vc.index = df_vc.index.map(weathers_dict.get)

Explanation

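  • Use dict with enumerate to construct a dictionary mapping integers (starting at 1) to the weather names in the list.
  • Use dict.get with pd.Index.map. pd.Series.map accepts a dictionary directly; older pandas versions of pd.Index.map did not, so a callable such as dict.get is passed instead. Recent pandas versions also accept a dictionary in pd.Index.map.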
  • Update the index directly rather than using an intermediary variable.
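
Because pd.Index.map takes any callable, the same pattern can supply a fallback label for codes that have no name. This is a minimal sketch, assuming a hypothetical extra code 4 and an "unknown" placeholder label, neither of which appears in the question.

```python
import pandas as pd

# The code 4 is a hypothetical category with no entry in the mapping.
df = pd.DataFrame({"weather": [1, 2, 1, 3, 4]})
weathers_dict = dict(enumerate(["sunny", "rainy", "cloudy"], 1))

df_vc = df["weather"].value_counts()

# dict.get with a default turns unmapped codes into "unknown" (an assumed label).
df_vc.index = df_vc.index.map(lambda code: weathers_dict.get(code, "unknown"))
print(df_vc)
```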

Alternatively, you can apply your map to weather before using pd.Series.value_counts. This way, you do not need to update the index of your result.

df['weather'] = df['weather'].map(weathers_dict)
df_vc = df['weather'].value_counts()
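
Note that the two lines above overwrite the weather column with strings. If the integer codes are still needed afterwards, a variant that chains the two calls without assigning back to the DataFrame may be preferable. The sketch below rebuilds the question's data so that it runs on its own.

```python
import pandas as pd

df = pd.DataFrame({"weather": [1, 2, 1, 3]})
weathers_dict = {1: "sunny", 2: "rainy", 3: "cloudy"}

# Map the codes to names on the fly; df["weather"] keeps its integer codes.
df_vc = df["weather"].map(weathers_dict).value_counts()
print(df_vc)
```

With pd.Series.map, codes missing from the dictionary become NaN, and value_counts drops NaN by default, so unmapped codes would not appear in the result unless dropna=False is passed.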
jpp answered Sep 21 '22 07:09
