In a given string, it is easy to search for the first occurrence of a substring like this:
int position = "01234".IndexOf ("23"); // returns 2
I want to search for the first occurrence of any of multiple possible strings:
var options = new [] {"77", "34", "12"};
int position = "01234".ImaginaryIndexOf (options); // should return 1
Such a function does not seem to exist in the .NET framework. Am I missing it?
Edit: To clarify, I am looking for a way that works well even for large inputs and unevenly distributed options. Imagine something similar to
var options = new [] {"x", "y"};
new String ('x', 1000*1000).IndexOf (options)
You can use any : a_string = "A string is more than its parts!" matches = ["more", "wholesome", "milk"] if any(x in a_string for x in matches): Similarly to check if all the strings from the list are found, use all instead of any .
The basic grep syntax when searching multiple patterns in a file includes using the grep command followed by strings and the name of the file or its path. The patterns need to be enclosed using single quotes and separated by the pipe symbol. Use the backslash before pipe | for regular expressions.
Method 2: Using any() with regular expression (re) Using regular expressions, we can easily check multiple substrings in a single-line statement. We use the findall() method of the re module to get all the matches as a list of strings and pass it to any() method to get the result in True or False.
Maybe something like that;
var options = new[] { "77", "34", "12" };
var position = options.Select(x => "01234".IndexOf(x))
.Where(x => x > -1).OrderBy(x => x).DefaultIfEmpty(-1)
.FirstOrDefault();
You could define a string extension;
public static class StringExtensions
{
public static int ImaginaryIndexOf(this string str,IEnumerable<string> options)
{
return options.Select(x => str.IndexOf(x))
.Where(x => x > -1).OrderBy(x => x)
.DefaultIfEmpty(-1).FirstOrDefault();
}
}
Then;
var options = new[] { "77", "34", "12" };
"01234".ImaginaryIndexOf(options);
No built-in method that I'm aware of..
But in order to do so you can iterate all the options
and for each calculate the IndexOf
. Then retrieve the minimal that is not -1
(of "not found"):
int position = options.Select(o => "01234".IndexOf(o))
.OrderBy(i =>i).FirstOrDefault(i=> i != -1);
Or instead of sorting (which is O(nlogn)
) find minimum (O(n)
):
int position = options.Select(o => "01234".IndexOf(o))
.Where(i => i != -1).DefaultIfEmpty(-1).Min();
As for the edit what you can consider is constructing and array of suffix trees - the array contains m items where m is the distinct amount of first letters of your options
words. As a general example:
if options is: "some", "word", "something", "other"
then you'd construct:
0 1 2...
+-----------------------+
| s | w | o |
+- | ------ | ------ | -+
o o t
| | |
m r h
| | |
e d e
/ \ | |
$ t r $
| |
... $
Then you iterate your string and for each letter you check if it is in the array. If not continue to next. If it is then you can deepen in the nested tree and check next letter of string compared to next level in tree. If at the end of the main string you have not reached any of the $
then no item of options
is in the text.
Of course you can have the array as a HashSet<T>
to improve the search of the first letter of a word.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With