How can I perform a case-insensitive search in Polars (Python)? Currently I have:
pdf.filter(pl.col("REFERENCE_2").str.contains("search string"))
I want this to also match "Search String" or "Search string",
similar to what pandas allows:
pdf.filter(pl.col("REFERENCE_2").str.contains("search string", case=False))
.str.contains() takes a valid (Rust) regex pattern.
The (?i) flag can be placed in the pattern to enable case-insensitive matching.
import polars as pl

df = pl.DataFrame({
    "col": ["seARch StRInG", "search string", "search", "string"]
})
df.with_columns(
    contains = pl.col("col").str.contains(r"(?i)search string")
)
shape: (4, 2)
┌───────────────┬──────────┐
│ col           ┆ contains │
│ ---           ┆ ---      │
│ str           ┆ bool     │
╞═══════════════╪══════════╡
│ seARch StRInG ┆ true     │
│ search string ┆ true     │
│ search        ┆ false    │
│ string        ┆ false    │
└───────────────┴──────────┘
(?-i) disables case-insensitive matching again, which is useful if only a specific part of the pattern should ignore case.
df.with_columns(
    contains = pl.col("col").str.contains("(?i)SEARCH(?-i) string")
)
shape: (4, 2)
┌───────────────┬──────────┐
│ col           ┆ contains │
│ ---           ┆ ---      │
│ str           ┆ bool     │
╞═══════════════╪══════════╡
│ seARch StRInG ┆ false    │
│ search string ┆ true     │
│ search        ┆ false    │
│ string        ┆ false    │
└───────────────┴──────────┘
Alternatively, if you do not need regex matching, there is .str.contains_any(), which searches for literal substrings and has an ascii_case_insensitive parameter.
df.with_columns(
    pl.col("col").str.contains_any(
        ["search string"],
        ascii_case_insensitive=True
    )
    .alias("contains_any")
)
shape: (4, 2)
┌───────────────┬──────────────┐
│ col           ┆ contains_any │
│ ---           ┆ ---          │
│ str           ┆ bool         │
╞═══════════════╪══════════════╡
│ seARch StRInG ┆ true         │
│ search string ┆ true         │
│ search        ┆ false        │
│ string        ┆ false        │
└───────────────┴──────────────┘