Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do you remove already filtered category values from DataFrames from plots and pivot tables?

My dataframes show video game titles, platforms, year of release, revenue, etc.

I have filtered the original dataframe "df_samplegames", which has 29 different platforms (type category), into a new dataframe "df_finalgames" that should only include the 6 platforms: '3DS', 'PSV', 'WiiU', 'PS4', 'XOne', 'PC'.

# Original dataframe:
df_samplegames['platform'].unique()

# Result:
['GEN', 'NES', 'NG', 'SCD', 'SNES', ..., '3DS', 'PSV', 'WiiU', 'PS4', 'XOne']
Length: 29
Categories (29, object): ['GEN', 'NES', 'NG', 'SCD', ..., 'PSV', 'WiiU', 'PS4', 'XOne']
# New dataframe:
df_finalgames = df_samplegames[df_samplegames['platform'].isin(
    ['3DS', 'PSV', 'WiiU', 'PS4', 'XOne', 'PC'])].sort_values(by = ['year', 'platform']).reset_index(drop = True)
df_finalgames['platform'].unique()

# Result:
['PC', '3DS', 'PSV', 'WiiU', 'PS4', 'XOne']
Categories (6, object): ['PC', '3DS', 'PSV', 'WiiU', 'PS4', 'XOne']

But whenever I want to use the latter dataframe to create a pairplot, or a pivot table, it keeps including the filtered out platform categories.

# Pivot table creation:
df_platgames = df_finalgames[['name', 'platform', 'total_sales']]
pvt_platgames = df_platgames.pivot_table(
    values = 'total_sales', index = 'name', columns = 'platform', aggfunc = 'sum')
pvt_platgames

# Result, note the column headers:
platform    2600    3DO 3DS DC  DS  GB  GBA GC  GEN GG  ... SAT SCD SNES    TG16    WS  Wii WiiU    X360    XB  XOne
name                                                                                    
Beyblade Burst  0.0 0.0 0.03    0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Fire Emblem Fates   0.0 0.0 1.67    0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Frozen: Olaf's Quest    0.0 0.0 0.59    0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
# Etc...

How can I force my code to ignore those platforms that I don't need? Thank you all in advance.

Sample Data for df_samplegames:

        name    platform    year    genre   na_sales    eu_sales    jp_sales    other_sales critic_score    user_score  rating  total_sales
9706    Press Your Luck 2010 Edition    DS  2009    Misc    0.18    0.00    0.00    0.01    68.967679   7.125046    E   0.19
3526    Stadium Games   GBA 2003    Sports  0.03    0.01    0.00    0.00    68.967679   7.125046    RP  0.04
1865    Evil Dead: Hail to the King PS  2000    Adventure   0.15    0.10    0.00    0.02    51.000000   6.000000    M   0.27
5117    Connect Four / Perfection / Trouble GBA 2005    Misc    0.19    0.07    0.00    0.00    68.967679   7.125046    RP  0.26
340 Rampo   SAT 1995    Adventure   0.00    0.00    0.03    0.00    68.967679   7.125046    RP  0.03

Thank you

like image 792
RicardoDLM Avatar asked Nov 15 '25 15:11

RicardoDLM


1 Answers

Various methods (such as pivot_table) will include “unused” categories by default.

  • https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html#operations
df2.pivot_table(index="name", columns="platform", values="total_sales", 
                aggfunc="sum")
# platform  w  x  y  z
# name
# b         0  2  0  0
# d         0  0  0  4

observed=True will only include used categories.

df2.pivot_table(index="name", columns="platform", values="total_sales", 
                aggfunc="sum", observed=True)
# platform    x    z
# name
# b         2.0  NaN
# d         NaN  4.0

Your code should actually warn (on at least v2.3.2) as the default will change.

FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior

Example dfs

import pandas as pd

df1 = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "platform": ["w", "x", "y", "z"],
    "total_sales": [1, 2, 3, 4]
})

df1 = df1.astype({"platform": "category"})
df2 = df1[df1["platform"].isin(["x", "z"])]
like image 130
jqurious Avatar answered Nov 17 '25 08:11

jqurious



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!