
In Python Polars, how do I search for a string across multiple columns and create a flag column if it is found in any of them?

The following code searches every column for a string and adds a flag column that is set when the string is found in any of them. It works, but is there a more compact way to do the same thing inside with_columns()?

import polars as pl

df = pl.DataFrame({
    "col1": ["hello", "world", "polars"],
    "col2": ["data", "science", "hello"],
    "col3": ["test", "string", "match"],
    "col4": ["hello", "example", "test"]
})


search_string = "hello"

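# Start from a False literal and OR in one contains() check per column.
condition = pl.lit(False)

for col in df.columns:
    condition |= pl.col(col).str.contains(search_string)

# Cast the Boolean flag to an integer (0/1) and name the new column.
df = df.with_columns(
    condition.cast(pl.Int32).alias("string_found")
)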


print(df)

shape: (3, 5)
┌────────┬─────────┬────────┬─────────┬──────────────┐
│ col1   ┆ col2    ┆ col3   ┆ col4    ┆ string_found │
│ ---    ┆ ---     ┆ ---    ┆ ---     ┆ ---          │
│ str    ┆ str     ┆ str    ┆ str     ┆ i32          │
╞════════╪═════════╪════════╪═════════╪══════════════╡
│ hello  ┆ data    ┆ test   ┆ hello   ┆ 1            │
│ world  ┆ science ┆ string ┆ example ┆ 0            │
│ polars ┆ hello   ┆ match  ┆ test    ┆ 1            │
└────────┴─────────┴────────┴─────────┴──────────────┘
Fred asked Sep 16 '25 23:09



1 Answer

You can use pl.any_horizontal():

df.with_columns(
    pl.any_horizontal(pl.all().str.contains(search_string))
      .alias("string_found")
)
shape: (3, 5)
┌────────┬─────────┬────────┬─────────┬──────────────┐
│ col1   ┆ col2    ┆ col3   ┆ col4    ┆ string_found │
│ ---    ┆ ---     ┆ ---    ┆ ---     ┆ ---          │
│ str    ┆ str     ┆ str    ┆ str     ┆ bool         │
╞════════╪═════════╪════════╪═════════╪══════════════╡
│ hello  ┆ data    ┆ test   ┆ hello   ┆ true         │
│ world  ┆ science ┆ string ┆ example ┆ false        │
│ polars ┆ hello   ┆ match  ┆ test    ┆ true         │
└────────┴─────────┴────────┴─────────┴──────────────┘

You can replace pl.all() with pl.col(pl.String) to limit the expression to String columns only.

In this example every column is a String column, so the two selectors behave the same.

jqurious answered Sep 19 '25 14:09
