Say I have a pandas DataFrame with three index levels 'a', 'b' and 'c'. How can I add a fourth level from an array and set its name to 'd' at the same time?
This works:
df.set_index(fourth_index, append=True, inplace=True)
df.index.set_names(['a','b','c','d'], inplace=True)
But I'm looking for something that doesn't require me to also name the first three indices again, e.g. (this doesn't work):
df.set_index({'d': fourth_index}, append=True, inplace=True)
Am I missing some function here?
Add fourth_index as a column and then call set_index. The name is preserved.
df = df.assign(d=fourth_index).set_index('d', append=True)
Note that if you're worried about memory, what you're already doing is fine as is. There's no point sacrificing performance to save a few characters.
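For anyone who wants to try this end to end, here is a minimal sketch. The sample frame, the `val` column, and the contents of `fourth_index` are made up for illustration. The second variant is not part of the answer above: it assumes that `set_index` takes the level name from a named `pd.Index`, which is how pandas treats Index and Series keys.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a frame indexed by three levels 'a', 'b', 'c'.
df = pd.DataFrame(
    {
        "a": [1, 1, 2, 2],
        "b": ["x", "y", "x", "y"],
        "c": [10, 20, 30, 40],
        "val": [0.1, 0.2, 0.3, 0.4],
    }
).set_index(["a", "b", "c"])

# Hypothetical values for the new level, one per row.
fourth_index = np.array(["p", "q", "r", "s"])

# Approach from the answer: add the array as column 'd' (assigned positionally),
# then append that column to the index. The column name becomes the level name.
via_assign = df.assign(d=fourth_index).set_index("d", append=True)

# Variant (an assumption of this sketch): wrap the array in a named Index so
# set_index picks up the name 'd' without creating a temporary column.
via_named_index = df.set_index(pd.Index(fourth_index, name="d"), append=True)

print(via_assign.index.names)
print(via_named_index.index.names)
```

Both calls return a new frame and leave `df` untouched, so the original three-level index is still available afterwards.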
Demo
df
          a   b   c   d
l1  l2                 
bar one  24  13   8   9
    two  11  30   7  23
baz one  21  31  12  30
    two   2   5  19  24
foo one  15  18   3  16
    two   2  24  28  11
qux one  23   9   6  12
    two  29  28  11  21
df.assign(l3=1).set_index('l3', append=True)
             a   b   c   d
l1  l2  l3                
bar one 1   24  13   8   9
    two 1   11  30   7  23
baz one 1   21  31  12  30
    two 1    2   5  19  24
foo one 1   15  18   3  16
    two 1    2  24  28  11
qux one 1   23   9   6  12
    two 1   29  28  11  21
Why not just save the existing index names beforehand, i.e.
old_names = df.index.names
df.set_index(fourth_index, append=True, inplace=True)
df.index.set_names(old_names + ['d'], inplace=True)
This keeps the good performance and doesn't require you to retype the old names.
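A self-contained sketch of the same idea, with made-up sample data (the `val` column and the values in `fourth_index` are assumptions). It uses reassignment instead of `inplace=True`, which is only a stylistic choice of this sketch, and converts the saved names to a plain list before concatenating.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a frame indexed by three levels 'a', 'b', 'c'.
df = pd.DataFrame(
    {"a": [1, 2], "b": ["x", "y"], "c": [10, 20], "val": [0.5, 0.7]}
).set_index(["a", "b", "c"])

# Hypothetical values for the new level, one per row.
fourth_index = np.array(["p", "q"])

# Remember the current level names before the index changes.
old_names = list(df.index.names)

# Append the unnamed array as a fourth level; its name is None at this point.
df = df.set_index(fourth_index, append=True)

# Restore the saved names and name the new level 'd' in one step.
df.index = df.index.set_names(old_names + ["d"])

print(df.index.names)
```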