Pandas 2.1.0 FutureWarning: Series.__getitem__ treating keys as positions is deprecated

Question

I'm having an issue with Pandas v2.1.0+ that I can't figure out.

I have a list of columns in my pandas data frame that I need to convert using a custom function. The new values depend on multiple columns in the data, so I'm using apply to convert the column in-place:

my_columns_to_convert = ['col1','col2','col3']

for k in my_columns_to_convert:
  df[k] = df[[k,colx]].apply(lambda x: convert_my_data(value_1_in=x[0],value_2_in=x[1]),axis=1)

This has worked just fine in previous versions of pandas. But now I get:

FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`

But I'm not using loc or iloc, and everything I've reviewed so far seems to point at that being the issue. How can I write this code so that I'm doing it the 'correct' way in the future?

So far I have only tried the approaches that worked in previous versions of pandas.

Timeless · Accepted Answer

This FutureWarning can be triggered in 2.1.0 with this simple example:

import pandas as pd

ser = pd.Series({"A": "a", "B": "b", "C": "c"})

# A    a
# B    b
# C    c
# dtype: object

print(ser[1]) # gives 'b' but with a FutureWarning: Series.__getitem__ treating keys..

The goal is to make [ ]-indexing behave consistently on a DataFrame and on a Series. Remember that df[1] does not return the column located at the second position of that DataFrame: it triggers a KeyError unless the literal 1 is an actual column label, in which case the column labelled 1 is returned.
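
To make the label-versus-position distinction concrete, here is a minimal sketch. The small DataFrame and the variable names (by_position, by_label, raised, second_column) are made up for illustration and are not part of the question.

```python
import pandas as pd

ser = pd.Series({"A": "a", "B": "b", "C": "c"})

by_position = ser.iloc[1]  # explicit positional access, no warning
by_label = ser["B"]        # explicit label access, no warning

# Hypothetical DataFrame with string column labels
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# [ ]-indexing a DataFrame with an integer looks for a column *labelled* 1,
# so it raises a KeyError here instead of returning the second column
try:
    df[1]
    raised = False
except KeyError:
    raised = True

second_column = df.iloc[:, 1]  # positional column access
```

Using .iloc for positions and plain [ ] (or .loc) for labels keeps the intent explicit on both a Series and a DataFrame.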

So based on your code (see how I imagine your df below), the warning comes from the lambda. With axis=1, apply passes each row as a Series whose index is the selected column labels, here [k, "colx"], which are strings and not integers. So when you index each Series with x[0] and x[1], pandas falls back to positional access and warns you to use x.iloc[0] and x.iloc[1] instead.

import numpy as np
import pandas as pd

my_columns_to_convert = ['col1', 'col2', 'col3']

df = pd.DataFrame(
    np.arange(12).reshape(-1, 4),
    index=list("ABC"), columns=my_columns_to_convert + ["colx"]
)

#    col1  col2  col3  colx
# A     0     1     2     3
# B     4     5     6     7
# C     8     9    10    11

def convert_my_data(value_1_in, value_2_in):
    return value_1_in * value_2_in # a simple calculation

for k in my_columns_to_convert:
    df[k] = (
        df[[k, "colx"]].apply(
            lambda x: convert_my_data(value_1_in=x[0], value_2_in=x[1]), axis=1)
    )

# df after the loop:
#    col1  col2  col3  colx
# A     0     3     6     3
# B    28    35    42     7
# C    88    99   110    11

# the FutureWarning is displayed three times (= the length of the Series):

FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  lambda x: convert_my_data(value_1_in=x[0], value_2_in=x[1]), axis=1)
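
A minimal sketch of the same loop without the deprecated lookup, assuming the imagined df and the stand-in convert_my_data from this answer (the real function is not shown in the question). It gives two equivalent spellings, positional access via .iloc and label access via the column names; df_pos and df_lab are just illustrative copies.

```python
import numpy as np
import pandas as pd

my_columns_to_convert = ["col1", "col2", "col3"]

df = pd.DataFrame(
    np.arange(12).reshape(-1, 4),
    index=list("ABC"),
    columns=my_columns_to_convert + ["colx"],
)

def convert_my_data(value_1_in, value_2_in):
    # Stand-in for the asker's function (assumption: a simple product)
    return value_1_in * value_2_in

# Option 1: positional access on each row Series with .iloc
df_pos = df.copy()
for k in my_columns_to_convert:
    df_pos[k] = df_pos[[k, "colx"]].apply(
        lambda x: convert_my_data(value_1_in=x.iloc[0], value_2_in=x.iloc[1]),
        axis=1,
    )

# Option 2: label access, since each row Series is indexed by column name
df_lab = df.copy()
for k in my_columns_to_convert:
    df_lab[k] = df_lab[[k, "colx"]].apply(
        lambda x: convert_my_data(value_1_in=x[k], value_2_in=x["colx"]),
        axis=1,
    )
```

Both options produce the same frame. The label-based version does not depend on column order, which may make it the safer choice if the column selection ever changes.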

As a side note, your code looks inefficient and can probably be vectorized easily.
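
For the simple product used as a stand-in in this answer, a vectorized version could look like the sketch below. This is only an illustration: whether the real convert_my_data can be expressed with column-wise operations depends on what it actually does, which the question does not show.

```python
import numpy as np
import pandas as pd

my_columns_to_convert = ["col1", "col2", "col3"]

df = pd.DataFrame(
    np.arange(12).reshape(-1, 4),
    index=list("ABC"),
    columns=my_columns_to_convert + ["colx"],
)

# Multiply every selected column by colx in one call, aligning on the row index
# (axis=0), instead of calling a Python function once per row
df[my_columns_to_convert] = df[my_columns_to_convert].mul(df["colx"], axis=0)
```

This avoids both the row-wise apply and the [ ]-lookup on each row Series, so the FutureWarning cannot occur in the first place.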

Tags: python, pandas, dataframe, warnings

Zach Morris
