
Pandas: assign a Series to a new column of a MultiIndex DataFrame

I create a DataFrame with a MultiIndex:

import pandas as pd

df = pd.DataFrame({
    'C1': ['x', 'x', 'y', 'y', 'z', 'z'],
    'C2': ['a', 'b', 'a', 'b', 'a', 'b'],
    'C3': [10, 11, 12, 13, 14, 15]})
df.set_index(['C1', 'C2'], inplace=True)

This gives the following DataFrame:

       C3
C1 C2    
x  a   10
   b   11
y  a   12
   b   13
z  a   14
   b   15

I also have a Series whose index uses the same labels as the C2 level:

series = pd.Series([100], index=['a'])

I would like to assign this series to a new column, C4, only for the rows whose first index level is 'x'. It works if I use .assign, but that returns a copy:

df.loc['x'].assign(C4=series)

and I obtain

    C3     C4
C2           
a   10  100.0
b   11    NaN

but assigning the result back to the original DataFrame fails:

df.loc['x'] = df.loc['x'].assign(C4=series)

yields

         C3
C1 C2      
x  a    NaN
   b    NaN

Assigning the series directly to the new column does not work either:

df.loc['x', 'C4'] = series

The new column contains only NaN:

         C3  C4
C1 C2          
x  a    NaN NaN
   b    NaN NaN
y  a   12.0 NaN
   b   13.0 NaN
z  a   14.0 NaN
   b   15.0 NaN

How can I assign the series in place so that only the 'x' rows are filled?

asked Mar 06 '23 23:03 by The Data Scientician


2 Answers

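You can use pd.IndexSlice to select the 'x' rows whose C2 label appears in the series, and then assign the series values positionally:

df.loc[pd.IndexSlice['x', series.index.tolist()], 'C4'] = series.values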

       C3     C4
C1 C2           
x  a   10  100.0
   b   11    NaN
y  a   12    NaN
   b   13    NaN
z  a   14    NaN
   b   15    NaN
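
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It makes two assumptions that are not in the answer above: the C4 column is created up front as NaN (so the example does not rely on .loc adding a new column), and every label in the series exists in the C2 level and appears in the same order as in the DataFrame, because the values are assigned by position rather than aligned by label.

```python
import numpy as np
import pandas as pd

# Rebuild the frame and the series from the question.
df = pd.DataFrame({
    'C1': ['x', 'x', 'y', 'y', 'z', 'z'],
    'C2': ['a', 'b', 'a', 'b', 'a', 'b'],
    'C3': [10, 11, 12, 13, 14, 15]}).set_index(['C1', 'C2'])
series = pd.Series([100], index=['a'])

# Assumption: create the target column first, filled with NaN.
df['C4'] = np.nan

# Select rows with C1 == 'x' and C2 in the series' labels,
# then write the raw values (no index alignment takes place).
idx = pd.IndexSlice
df.loc[idx['x', series.index.tolist()], 'C4'] = series.to_numpy()

print(df)
```

Passing a plain array on the right-hand side is what avoids the all-NaN result from the question: a Series on the right would be aligned against the full (C1, C2) index, where the bare label 'a' matches nothing.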
answered Mar 11 '23 21:03 by Bharath


I would like to assign this series to a new column, C4, only for the rows whose first index level is 'x'.

One way is to map the labels of one index level through your series. The key method is pd.Index.get_level_values. Then overwrite the result with NaN in the rows where the mapping is not wanted.

import numpy as np

df['C4'] = df.index.get_level_values(1).map(series.get)
df.loc[df.index.get_level_values(0) != 'x', 'C4'] = np.nan

print(df)

       C3     C4
C1 C2           
x  a   10  100.0
   b   11    NaN
y  a   12    NaN
   b   13    NaN
z  a   14    NaN
   b   15    NaN

Alternatively, you can use numpy.where:

df['C4'] = np.where(df.index.get_level_values(0) == 'x',
                    df.index.get_level_values(1).map(series.get),
                    np.nan)
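
A self-contained variant of the same idea is sketched below. Instead of .map(series.get) it uses Series.reindex for the lookup, which is my substitution rather than part of the answer above; it assumes the series index has no duplicate labels. Labels that are missing from the series come back as NaN, so the looked-up values are already floats before np.where masks the rows outside 'x'.

```python
import numpy as np
import pandas as pd

# Rebuild the frame and the series from the question.
df = pd.DataFrame({
    'C1': ['x', 'x', 'y', 'y', 'z', 'z'],
    'C2': ['a', 'b', 'a', 'b', 'a', 'b'],
    'C3': [10, 11, 12, 13, 14, 15]}).set_index(['C1', 'C2'])
series = pd.Series([100], index=['a'])

level_c1 = df.index.get_level_values('C1')
level_c2 = df.index.get_level_values('C2')

# Look up every row's C2 label in the series; absent labels become NaN.
looked_up = series.reindex(level_c2).to_numpy(dtype=float)

# Keep the looked-up value only where C1 is 'x'.
df['C4'] = np.where(level_c1 == 'x', looked_up, np.nan)

print(df)
```

Because the right-hand side is a plain NumPy array with one value per row, the assignment is positional and the MultiIndex never has to be aligned against the series.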
answered Mar 11 '23 21:03 by jpp