
Polars Case Statement

I am trying to learn the Python package polars. I come from an R background, so I appreciate this might be an incredibly easy question.

I want to implement a case statement where, if any of the conditions below is true, a flag is set to 1, and otherwise to 0. The new column will be called 'my_new_column_flag'.

However, I am getting this error message:

TypeError: invalid input for `col`. Expected `str` or `DataType`, got 'int'.

import polars as pl
import numpy as np

np.random.seed(12)

df = pl.DataFrame(
    {
        "nrs": [1, 2, 3, None, 5],
        "names": ["foo", "ham", "spam", "egg", None],
        "random": np.random.rand(5),
        "groups": ["A", "A", "B", "C", "B"],
    }
)
print(df)

df.with_columns(
    pl.when(pl.col('nrs') == 1).then(pl.col(1))
    .when(pl.col('names') == 'ham').then(pl.col(1))
    .when(pl.col('random') == 0.014575).then(pl.col(1))
    .otherwise(pl.col(0))
    .alias('my_new_column_flag')
)

Can anyone help?

John Smith asked Jun 19 '26 22:06


1 Answer

pl.col selects an existing column by name, so it expects a string (or a data type); passing the integer 1 is what raises the TypeError. What you want is a column holding the literal value 1, which is pl.lit(1):

df.with_columns(
    pl.when(pl.col('nrs') == 1).then(pl.lit(1))
    .when(pl.col('names') == 'ham').then(pl.lit(1))
    .when(pl.col('random') == 0.014575).then(pl.lit(1))
    .otherwise(pl.lit(0))
    .alias('my_new_column_flag')
)

PS: it may look more natural to use a predicate for your flag (and cast it to int if you want it to be 0/1 instead of true/false):


df.with_columns(
    ((pl.col("nrs") == 1) | (pl.col("names") == "ham") | (pl.col("random") == 0.014575))
    .alias("my_new_column_flag")
    .cast(int)
)

0x26res answered Jun 21 '26 13:06


