
Case-insensitive string matching in Rust

Tags: string, rust

Is there a simple way to use str::matches case-insensitively?

Ala Douagi asked Nov 15 '17 02:11




2 Answers

You can always convert both strings to the same casing. This will work for some cases:

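fn main() {
    let needle = "μτς";
    let haystack = "ΜΤΣ";

    // Lowercase both sides so they are compared in the same case.
    let needle = needle.to_lowercase();
    let haystack = haystack.to_lowercase();

    // Each match is a slice of the lowercased haystack, not of the original.
    for i in haystack.matches(&needle) {
        println!("{:?}", i);
    }
}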

See also str::to_ascii_lowercase for ASCII-only variants.
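
As a rough sketch of the ASCII-only route mentioned above, assuming you only care about folding the letters A-Z: the helper name `ascii_ci_match_indices` is made up for this illustration, while `to_ascii_lowercase`, `match_indices` and `eq_ignore_ascii_case` are standard library methods. Because ASCII lowercasing never changes the byte length of a string, the offsets it returns are also valid in the original haystack.

```rust
/// Hypothetical helper: byte offsets at which `needle` occurs in `haystack`,
/// ignoring ASCII case only. Non-ASCII characters must match exactly.
fn ascii_ci_match_indices(haystack: &str, needle: &str) -> Vec<usize> {
    let haystack = haystack.to_ascii_lowercase();
    let needle = needle.to_ascii_lowercase();
    haystack
        .match_indices(needle.as_str())
        .map(|(index, _)| index)
        .collect()
}

fn main() {
    // ASCII letters are folded, so both occurrences are found.
    println!("{:?}", ascii_ci_match_indices("Rust rUST", "rust")); // [0, 5]

    // Greek letters are left untouched, so nothing matches here.
    println!("{:?}", ascii_ci_match_indices("ΜΤΣ", "μτς")); // []

    // For whole-string equality no allocation is needed.
    println!("{}", "Hello".eq_ignore_ascii_case("hELLO")); // true
}
```

This is only appropriate when the text is known to be ASCII or when non-ASCII letters should be compared exactly.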

In other cases, the regex crate might do enough case-folding (potentially Unicode) for you:

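use regex::RegexBuilder; // 1.4.3

fn main() {
    let needle = "μτς";
    let haystack = "ΜΤΣ";

    // Escape the needle so that any regex metacharacters in it match literally.
    let needle = RegexBuilder::new(&regex::escape(needle))
        .case_insensitive(true)
        .build()
        .expect("Invalid Regex");

    // Each item is a regex::Match holding the byte range and the matched text.
    for i in needle.find_iter(haystack) {
        println!("{:?}", i);
    }
}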

However, remember that Rust's strings are UTF-8, so you have to deal with all of Unicode. Case mappings are not always one-to-one, which means that choosing upper case rather than lower case might change your results. Likewise, the only correct way to change text casing requires that you know the language of the text; it's not an inherent property of the bytes. And yes, you can have strings which contain emoji and other exciting things beyond the Basic Multilingual Plane.
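
To make the point about the choice of case concrete, here is a small sketch using only the standard library. The helpers `eq_lower` and `eq_upper` are hypothetical names introduced for this illustration; the German "ß" is used because its uppercase form is the two letters "SS".

```rust
/// Hypothetical helper: compare after lowercasing both sides.
fn eq_lower(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}

/// Hypothetical helper: compare after uppercasing both sides.
fn eq_upper(a: &str, b: &str) -> bool {
    a.to_uppercase() == b.to_uppercase()
}

fn main() {
    let a = "straße";
    let b = "STRASSE";

    // Lowercasing leaves "ß" alone, so "straße" != "strasse".
    println!("{}", eq_lower(a, b)); // false

    // Uppercasing expands "ß" to "SS", so both become "STRASSE".
    println!("{}", eq_upper(a, b)); // true

    // The standard library is not locale-aware: "I" always becomes "i",
    // although Turkish text would expect a dotless "ı".
    println!("{}", "I".to_lowercase()); // i

    // Characters without a case, such as emoji, pass through unchanged.
    println!("{}", "Crab 🦀".to_lowercase()); // crab 🦀
}
```

Neither direction is right in general; which one is acceptable depends on the text you expect to handle.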

See also:

  • How can I case fold a string in Rust?
  • Why is capitalizing the first letter of a string so convoluted in Rust?

Shepmaster answered Oct 10 '22 09:10


If you're using the regex crate, you can make the pattern case insensitive:

use regex::Regex;

let re = Regex::new(r"(?i)μτς").unwrap(); // (?i) turns on case-insensitive matching
let mat = re.find("ΜΤΣ").unwrap();

Rumpelstiltskin Koriat answered Oct 10 '22 07:10