
Pandas 2.1.0 FutureWarning: Series.__getitem__ treating keys as positions is deprecated

I'm having an issue with Pandas v2.1.0+ that I can't figure out.

I have a list of columns in my pandas data frame that I need to convert using a custom function. The new values depend on multiple columns in the data, so I'm using apply to convert the column in-place:

my_columns_to_convert = ['col1','col2','col3']

for k in my_columns_to_convert:
  df[k] = df[[k,colx]].apply(lambda x: convert_my_data(value_1_in=x[0],value_2_in=x[1]),axis=1)

This has worked just fine in previous versions of pandas. But now I get:

FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`

But I'm not using loc or iloc, and everything I've reviewed so far seems to point at that being the issue. How can I write this code so that it's done the 'correct' way going forward?

What I tried: the approaches that worked in previous versions of pandas.

Zach Morris asked Sep 12 '25 14:09


1 Answer

This FutureWarning can be triggered in pandas 2.1.0 with this simple example:

import pandas as pd

ser = pd.Series({"A": "a", "B": "b", "C": "c"})

# A    a
# B    b
# C    c
# dtype: object

print(ser[1]) # gives 'b' but with a FutureWarning: Series.__getitem__ treating keys..
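
For comparison, here is a minimal sketch of the two explicit ways to get the same value from that Series. The variable names by_position and by_label are only illustrative:

```python
import pandas as pd

ser = pd.Series({"A": "a", "B": "b", "C": "c"})

# Explicit positional access: second element, whatever its label is
by_position = ser.iloc[1]

# Explicit label access: the element whose label is "B"
by_label = ser["B"]

print(by_position, by_label)
```

Both lookups return 'b' here, and neither relies on the integer-as-position fallback that the warning is about.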

The goal is to make [ ]-indexing behave consistently for a DataFrame and a Series. Remember that df[1] does not return the column located at the second position of the DataFrame; it raises a KeyError (unless the literal 1 is an actual column label, in which case the column labelled 1 is returned).
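
A small sketch of that DataFrame behaviour, using two made-up frames (df_demo and df_int are hypothetical names, not from the question):

```python
import pandas as pd

# String column labels: an integer key is looked up as a label and fails
df_demo = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
try:
    df_demo[1]
    raised = False
except KeyError:
    raised = True

# Positional access to the second column has to be explicit
second_col = df_demo.iloc[:, 1]

# Integer column labels: df[1] returns the column whose label is 1
df_int = pd.DataFrame({0: [1, 2], 1: [3, 4]})
by_label = df_int[1]
```

In other words, [ ] on a DataFrame always treats the key as a label, and the deprecation moves Series toward the same rule.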


Based on your code, your df (see how I imagine it below) has string column labels. With axis=1, apply passes each row to the lambda as a Series indexed by the labels of the selected columns (here [k, "colx"]), not by the DataFrame's row index. So x[0] and x[1] are integer keys used on a string-labelled Series: pandas falls back to positional access and warns you to use x.iloc[0] and x.iloc[1] instead.

import numpy as np
import pandas as pd

my_columns_to_convert = ['col1', 'col2', 'col3']

df = pd.DataFrame(
    np.arange(12).reshape(-1, 4),
    index=list("ABC"), columns=my_columns_to_convert + ["colx"]
)

# df before the loop:
#    col1  col2  col3  colx
# A     0     1     2     3
# B     4     5     6     7
# C     8     9    10    11

def convert_my_data(value_1_in, value_2_in):
    return value_1_in * value_2_in # a simple calculation

# x[0] and x[1] below are what trigger the FutureWarning
for k in my_columns_to_convert:
    df[k] = (
        df[[k, "colx"]].apply(
            lambda x: convert_my_data(value_1_in=x[0], value_2_in=x[1]), axis=1)
    )

# df after the loop:
#    col1  col2  col3  colx
# A     0     3     6     3
# B    28    35    42     7
# C    88    99   110    11

# the FutureWarning is displayed three times (= the length of the Series) :

FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use ser.iloc[pos]: lambda x: convert_my_data(value_1_in=x[0], value_2_in=x[1]), axis=1)
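
A sketch of the same loop without the deprecated lookup, assuming the imagined df and the placeholder convert_my_data from this answer (the real function in the question is not shown). The first loop uses positions via iloc; the second, on a copy named df_by_label purely for illustration, uses the column labels:

```python
import numpy as np
import pandas as pd

my_columns_to_convert = ["col1", "col2", "col3"]

df = pd.DataFrame(
    np.arange(12).reshape(-1, 4),
    index=list("ABC"),
    columns=my_columns_to_convert + ["colx"],
)
df_by_label = df.copy()


def convert_my_data(value_1_in, value_2_in):
    # Placeholder calculation standing in for the real conversion
    return value_1_in * value_2_in


# Option 1: explicit positional access on each row Series
for k in my_columns_to_convert:
    df[k] = df[[k, "colx"]].apply(
        lambda x: convert_my_data(value_1_in=x.iloc[0], value_2_in=x.iloc[1]),
        axis=1,
    )

# Option 2: explicit label access on each row Series
for k in my_columns_to_convert:
    df_by_label[k] = df_by_label[[k, "colx"]].apply(
        lambda x: convert_my_data(value_1_in=x[k], value_2_in=x["colx"]),
        axis=1,
    )
```

Both options give the same result. The label-based form does not depend on the order of the columns in the selection, which makes it a little harder to break by accident.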

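As a side note, a row-wise apply calls your Python function once per row, which is usually slow on large frames. If convert_my_data can be expressed with column-wise operations, the loop can probably be vectorized.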
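
A minimal sketch of what vectorizing could look like, assuming the conversion is the simple multiplication used as a stand-in in this answer. Whether the real convert_my_data can be rewritten this way depends on what it does:

```python
import numpy as np
import pandas as pd

my_columns_to_convert = ["col1", "col2", "col3"]

df = pd.DataFrame(
    np.arange(12).reshape(-1, 4),
    index=list("ABC"),
    columns=my_columns_to_convert + ["colx"],
)

# Multiply every selected column by colx in one call;
# axis=0 aligns the colx Series with the rows of the DataFrame
df[my_columns_to_convert] = df[my_columns_to_convert].mul(df["colx"], axis=0)
```

This replaces both the for loop and the apply call, and it never indexes a row Series, so the warning cannot occur.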

Timeless answered Sep 15 '25 04:09