I've created a small dataset comparing coffee drink prices per cup size.
When I pivot my dataset, the output automatically sorts the index (the 'Size' column) alphabetically.
Is there a way to assign the different sizes a numerical level (e.g. small = 0, medium = 1, large = 2) and reorder the rows this way instead?
I know this can be done in R with the forcats library (using fct_relevel, for example), but I don't know how to do it in Python. I would prefer a solution that uses only numpy and pandas.
import numpy as np
import pandas as pd

data = {'Item': np.repeat(['Latte', 'Americano', 'Cappuccino'], 3),
        'Size': ['Small', 'Medium', 'Large'] * 3,
        'Price': [2.25, 2.60, 2.85, 1.95, 2.25, 2.45, 2.65, 2.95, 3.25]}
df = pd.DataFrame(data, columns=['Item', 'Size', 'Price'])
# Pivot so that each row is a size and each column is an item
df = pd.pivot_table(df, index=['Size'], columns='Item')
df
#            Price
# Item   Americano Cappuccino Latte
# Size
# Large       2.45       3.25  2.85
# Medium      2.25       2.95  2.60
# Small       1.95       2.65  2.25
You can use a Categorical type with ordered=True:
df.index = pd.Categorical(df.index,
                          categories=['Small', 'Medium', 'Large'],
                          ordered=True)
df = df.sort_index()
output:
           Price
Item   Americano Cappuccino Latte
Small       1.95       2.65  2.25
Medium      2.25       2.95  2.60
Large       2.45       3.25  2.85
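As an alternative, the 'Size' column can be made categorical before pivoting, so the pivot itself comes out in the desired order. This is only a sketch under a few assumptions: the names raw, size_order and pivoted are made up for illustration, values='Price' is passed so the columns are not nested under a 'Price' level, and observed=True is passed explicitly because the default for that parameter differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Same data as in the question, before any pivoting
raw = pd.DataFrame({
    'Item': np.repeat(['Latte', 'Americano', 'Cappuccino'], 3),
    'Size': ['Small', 'Medium', 'Large'] * 3,
    'Price': [2.25, 2.60, 2.85, 1.95, 2.25, 2.45, 2.65, 2.95, 3.25],
})

# Desired row order (hypothetical helper name)
size_order = ['Small', 'Medium', 'Large']

# Convert 'Size' to an ordered categorical so grouping follows
# the category order rather than alphabetical order
raw['Size'] = pd.Categorical(raw['Size'], categories=size_order, ordered=True)

# observed=True keeps only the categories that actually occur
pivoted = pd.pivot_table(raw, index='Size', columns='Item',
                         values='Price', observed=True)
print(pivoted)
```

With this approach no separate sort_index() call should be needed, since the grouping inside pivot_table already follows the category order.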
You can access the codes with:
>>> df.index.codes
array([0, 1, 2], dtype=int8)
If this were a Series with a categorical dtype, you would go through the .cat accessor instead:
>>> series.cat.codes
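To make the Series case concrete, here is a small self-contained sketch. The variable names sizes and codes and the sample values are assumptions chosen for illustration; they are not part of the original data.

```python
import pandas as pd

# A Series of sizes in arbitrary order, stored as an ordered categorical
sizes = pd.Series(pd.Categorical(['Large', 'Small', 'Medium'],
                                 categories=['Small', 'Medium', 'Large'],
                                 ordered=True))

# Each value maps to the position of its category:
# Small = 0, Medium = 1, Large = 2
codes = sizes.cat.codes
print(codes.tolist())  # [2, 0, 1]

# Sorting the Series follows the category order, not the alphabet
print(sizes.sort_values().tolist())  # ['Small', 'Medium', 'Large']
```

These codes are the numerical levels asked about in the question (small = 0, medium = 1, large = 2).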