Finding all positions of substring in a larger string in C#

I have a large string I need to parse, and I need to find all instances of extract"(me,i-have lots. of]punctuation and store the index of each in a list.

So if this substring appeared at the beginning and in the middle of the larger string, both occurrences would be found and their indexes added to the List. The List would then contain 0 and the index of the second occurrence.

I've been playing around, and string.IndexOf does almost what I'm looking for. I've written some code, but it's not working and I've been unable to figure out exactly what is wrong:

    List<int> inst = new List<int>();
    int index = 0;
    while (index < source.LastIndexOf("extract\"(me,i-have lots. of]punctuation", 0) + 39)
    {
        int src = source.IndexOf("extract\"(me,i-have lots. of]punctuation", index);
        inst.Add(src);
        index = src + 40;
    }
  • inst = The list
  • source = The large string

Any better ideas?

caesay asked Apr 14 '10 21:04


2 Answers

Here's an example extension method for it:

    public static List<int> AllIndexesOf(this string str, string value)
    {
        if (String.IsNullOrEmpty(value))
            throw new ArgumentException("the string to find may not be empty", "value");
        List<int> indexes = new List<int>();
        // Resume each search just past the previous match (non-overlapping matches).
        for (int index = 0;; index += value.Length)
        {
            index = str.IndexOf(value, index);
            if (index == -1)
                return indexes; // no further occurrences
            indexes.Add(index);
        }
    }

If you put this into a static class and import the namespace with using, it appears as a method on any string, and you can just do:

List<int> indexes = "fooStringfooBar".AllIndexesOf("foo"); 

For more information on extension methods, see http://msdn.microsoft.com/en-us/library/bb383977.aspx
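For completeness, here is a minimal sketch of how the method might sit inside a static class. The class name `StringExtensions`, the null check on `str`, and the use of `StringComparison.Ordinal` are assumptions added for illustration and are not part of the code above. Ordinal comparison is chosen here because the search target is mostly punctuation and a culture-sensitive match is probably not wanted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container class; extension methods must live in a static class.
public static class StringExtensions
{
    public static List<int> AllIndexesOf(this string str, string value)
    {
        if (str == null)
            throw new ArgumentNullException(nameof(str));
        if (string.IsNullOrEmpty(value))
            throw new ArgumentException("the string to find may not be empty", nameof(value));

        var indexes = new List<int>();
        int index = 0;
        // Assumption: ordinal (character-by-character) comparison is intended.
        while ((index = str.IndexOf(value, index, StringComparison.Ordinal)) != -1)
        {
            indexes.Add(index);
            index += value.Length; // continue after the match just found
        }
        return indexes;
    }
}
```

With this in scope, `"fooStringfooBar".AllIndexesOf("foo")` should yield the indexes 0 and 9.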

The same thing using an iterator:

    public static IEnumerable<int> AllIndexesOf(this string str, string value)
    {
        if (String.IsNullOrEmpty(value))
            throw new ArgumentException("the string to find may not be empty", "value");
        // Resume each search just past the previous match (non-overlapping matches).
        for (int index = 0;; index += value.Length)
        {
            index = str.IndexOf(value, index);
            if (index == -1)
                break; // no further occurrences
            yield return index; // results are produced lazily, one per iteration
        }
    }
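Two details of the iterator version may matter in practice. Because the method body contains `yield`, the argument check only runs once enumeration starts, and because the search advances by `value.Length`, overlapping occurrences are skipped. The sketch below is one possible variation, not the answer's own code: the names `OverlappingSearchSketch` and `AllOverlappingIndexesOf` are hypothetical, and `StringComparison.Ordinal` is an assumption. It validates eagerly and advances by one character so that overlapping matches are reported.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class and method names, for illustration only.
public static class OverlappingSearchSketch
{
    public static IEnumerable<int> AllOverlappingIndexesOf(this string str, string value)
    {
        // Validation lives in a non-iterator method so it runs at call time.
        if (str == null)
            throw new ArgumentNullException(nameof(str));
        if (string.IsNullOrEmpty(value))
            throw new ArgumentException("the string to find may not be empty", nameof(value));
        return Iterate(str, value);
    }

    private static IEnumerable<int> Iterate(string str, string value)
    {
        int index = 0;
        // Assumption: ordinal comparison is intended.
        while ((index = str.IndexOf(value, index, StringComparison.Ordinal)) != -1)
        {
            yield return index;
            index++; // step one character so overlapping matches are found too
        }
    }
}
```

For example, searching `"aaaa"` for `"aa"` this way reports 0, 1 and 2, whereas advancing by `value.Length` would report only 0 and 2.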
Matti Virkkunen answered Sep 28 '22 19:09


Why not use the built-in Regex class?

    public static IEnumerable<int> GetAllIndexes(this string source, string matchString)
    {
        // Escape the search text so punctuation is matched literally.
        matchString = Regex.Escape(matchString);
        foreach (Match match in Regex.Matches(source, matchString))
        {
            yield return match.Index;
        }
    }

If you need to reuse the expression, compile it and cache it somewhere. For that case, add another overload that takes a Regex matchExpression parameter instead of the string matchString.
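The reuse case could look roughly like the sketch below. The class names `RegexSearchSketch` and `CachedPatterns` and the field name `Target` are hypothetical, chosen only for illustration. The cached pattern uses the search string from the question, escaped so that its punctuation is matched literally.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical class name; holds the overload that accepts a prebuilt Regex.
public static class RegexSearchSketch
{
    public static IEnumerable<int> GetAllIndexes(this string source, Regex matchExpression)
    {
        foreach (Match match in matchExpression.Matches(source))
        {
            yield return match.Index;
        }
    }
}

// Hypothetical holder for an expression that is built once and reused.
public static class CachedPatterns
{
    public static readonly Regex Target = new Regex(
        Regex.Escape("extract\"(me,i-have lots. of]punctuation"),
        RegexOptions.Compiled);
}
```

A call would then read `source.GetAllIndexes(CachedPatterns.Target)`. Note that `Regex.Matches` reports non-overlapping matches only.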

csaam answered Sep 28 '22 20:09