Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Search multiple strings in string

Tags:

string

c#

In a given string, it is easy to search for the first occurrence of a substring like this:

int position = "01234".IndexOf ("23"); // returns 2

I want to search for the first occurrence of any of multiple possible strings:

var options = new [] {"77", "34", "12"};
int position = "01234".ImaginaryIndexOf (options); // should return 1

Such a function does not seem to exist in the .NET framework. Am I missing it?

Edit: To clarify, I am looking for a way that works well even for large inputs and unevenly distributed options. Imagine something similar to

var options = new [] {"x", "y"};
new String ('x', 1000*1000).IndexOf (options)
like image 310
mafu Avatar asked Jan 17 '18 05:01

mafu


People also ask

How do you find multiple words in a string?

You can use any : a_string = "A string is more than its parts!" matches = ["more", "wholesome", "milk"] if any(x in a_string for x in matches): Similarly to check if all the strings from the list are found, use all instead of any .

How do I search for multiple strings in a list?

The basic grep syntax when searching multiple patterns in a file includes using the grep command followed by strings and the name of the file or its path. The patterns need to be enclosed using single quotes and separated by the pipe symbol. Use the backslash before pipe | for regular expressions.

How do I search for multiple strings in Python?

Method 2: Using any() with regular expression (re) Using regular expressions, we can easily check multiple substrings in a single-line statement. We use the findall() method of the re module to get all the matches as a list of strings and pass it to any() method to get the result in True or False.


2 Answers

Maybe something like that;

var options = new[] { "77", "34", "12" };
var position = options.Select(x => "01234".IndexOf(x))
    .Where(x => x > -1).OrderBy(x => x).DefaultIfEmpty(-1)
    .FirstOrDefault();

You could define a string extension;

public static class StringExtensions
{
    public static int ImaginaryIndexOf(this string str,IEnumerable<string> options)
    {
        return options.Select(x => str.IndexOf(x))
            .Where(x => x > -1).OrderBy(x => x)
            .DefaultIfEmpty(-1).FirstOrDefault();
    }
}

Then;

var options = new[] { "77", "34", "12" };
"01234".ImaginaryIndexOf(options);
like image 146
lucky Avatar answered Sep 29 '22 07:09

lucky


No built-in method that I'm aware of..

But in order to do so you can iterate all the options and for each calculate the IndexOf. Then retrieve the minimal that is not -1 (of "not found"):

int position = options.Select(o => "01234".IndexOf(o))
                      .OrderBy(i =>i).FirstOrDefault(i=> i != -1);

Or instead of sorting (which is O(nlogn)) find minimum (O(n)):

int position = options.Select(o => "01234".IndexOf(o))
                      .Where(i => i != -1).DefaultIfEmpty(-1).Min();

As for the edit what you can consider is constructing and array of suffix trees - the array contains m items where m is the distinct amount of first letters of your options words. As a general example:

if options is: "some", "word", "something", "other" then you'd construct:

    0        1        2...
 +-----------------------+
 |  s    |   w    |   o  |
 +- | ------ | ------ | -+
    o        o        t
    |        |        |
    m        r        h
    |        |        |
    e        d        e
   / \       |        |
  $   t      r        $
      |      |
      ...    $

Then you iterate your string and for each letter you check if it is in the array. If not continue to next. If it is then you can deepen in the nested tree and check next letter of string compared to next level in tree. If at the end of the main string you have not reached any of the $ then no item of options is in the text. Of course you can have the array as a HashSet<T> to improve the search of the first letter of a word.

like image 20
Gilad Green Avatar answered Sep 29 '22 07:09

Gilad Green