Say I have a pandas dataframe with three indices 'a', 'b' and 'c' - how can I add a fourth index from an array and set its name to 'd' at the same time?
This works:
df.set_index(fourth_index, append=True, inplace=True)
df.index.set_names(['a','b','c','d'], inplace=True)
But I'm looking for something that doesn't require me to also name the first three indices again, e.g. (this doesn't work):
df.set_index({'d': fourth_index}, append=True, inplace=True)
Am I missing some function here?
Add fourth_index as a column and then call set_index. The name is preserved.
df = df.assign(d=fourth_index).set_index('d', append=True)
Note that if you're worried about memory, what you're already doing is fine as is. There's no point sacrificing performance to save a few characters.
Demo
df
          a   b   c   d
l1  l2
bar one  24  13   8   9
    two  11  30   7  23
baz one  21  31  12  30
    two   2   5  19  24
foo one  15  18   3  16
    two   2  24  28  11
qux one  23   9   6  12
    two  29  28  11  21
df.assign(l3=1).set_index('l3', append=True)
             a   b   c   d
l1  l2  l3
bar one 1   24  13   8   9
    two 1   11  30   7  23
baz one 1   21  31  12  30
    two 1    2   5  19  24
foo one 1   15  18   3  16
    two 1    2  24  28  11
qux one 1   23   9   6  12
    two 1   29  28  11  21
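For the case in the question (levels 'a', 'b', 'c' plus a new level 'd' taken from an array), here is a minimal self-contained sketch of the same assign-then-set_index idea. The sample frame, the `value` column and the contents of `fourth_index` are made up for illustration; only `assign` and `set_index` come from the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three index levels 'a', 'b', 'c' and one value column.
df = pd.DataFrame(
    {
        "a": [1, 1, 2, 2],
        "b": ["x", "y", "x", "y"],
        "c": [10, 20, 30, 40],
        "value": [0.1, 0.2, 0.3, 0.4],
    }
).set_index(["a", "b", "c"])

# Hypothetical array for the new level; its length must match len(df).
fourth_index = np.array([7, 8, 9, 10])

# assign() returns a new frame with an extra column 'd'; set_index() then
# moves that column into the index and reuses its label as the level name.
result = df.assign(d=fourth_index).set_index("d", append=True)

print(result.index.names)
```

Because `assign` returns a new frame, `df` itself keeps its original three levels, so assign the result back to `df` if you want to replace it.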
Why not just save the existing index names beforehand, i.e.
old_names = df.index.names
df.set_index(fourth_index, append=True, inplace=True)
df.index.set_names(old_names + ['d'], inplace=True)
This keeps the good performance of the original approach and doesn't require you to retype the old names.
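A runnable sketch of the saved-names idea, written without `inplace` so the original frame is left untouched. The sample data and the variable names `by_saved_names` and `by_named_index` are hypothetical. The second variant is not from the answers above: it assumes that `set_index` takes the level name from a named `pd.Index`, which is how it treats `Index` and `Series` keys as far as I know.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three index levels 'a', 'b', 'c' and one value column.
df = pd.DataFrame(
    {
        "a": [1, 1, 2, 2],
        "b": ["x", "y", "x", "y"],
        "c": [10, 20, 30, 40],
        "value": [0.1, 0.2, 0.3, 0.4],
    }
).set_index(["a", "b", "c"])

# Hypothetical array for the new level; its length must match len(df).
fourth_index = np.array([7, 8, 9, 10])

# Variant 1: remember the current names, append the unnamed level,
# then restore the old names with 'd' added at the end.
old_names = list(df.index.names)
by_saved_names = df.set_index(fourth_index, append=True)
by_saved_names.index = by_saved_names.index.set_names(old_names + ["d"])

# Variant 2 (an alternative, not from the answers above): wrap the array in
# a named Index so the new level arrives already named 'd'.
by_named_index = df.set_index(pd.Index(fourth_index, name="d"), append=True)

print(by_saved_names.index.names)
print(by_named_index.index.names)
```

Either variant avoids adding a temporary column, and neither requires spelling out 'a', 'b' and 'c' again.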