Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C# - Code optimization to get all substrings from a string

I was working on a code snippet to get all substrings from a given string.

Here is the code that I use

 var stringList = new List<string>();
 for (int length = 1; length < mainString.Length; length++)
 {
    for (int start = 0; start <= mainString.Length - length; start++)
    {
       var substring = mainString.Substring(start, length);
       stringList.Add(substring);
    }
 }

It looks not so great to me, with two for loops. Is there any other way that I can achieve this with better time complexity.

I am stuck on the point that, for getting a substring, I will surely need two loops. Is there any other way I can look into ?

like image 280
Praneet Nadkar Avatar asked Mar 20 '18 05:03

Praneet Nadkar


3 Answers

The number of substrings in a string is O(n^2), so one loop inside another is the best you can do. You are correct in your code structure.

Here's how I would've phrased your code:

void Main()
{
    var stringList = new List<string>();
    string s = "1234";
    for (int i=0; i <s.Length; i++)
        for (int j=i; j < s.Length; j++)
            stringList.Add(s.Substring(i,j-i+1));
}
like image 175
Phillip Ngan Avatar answered Oct 08 '22 17:10

Phillip Ngan


You do need 2 for loops

Demo here

var input = "asd sdf dfg";
var stringList = new List<string>();

for (int i = 0; i < input.Length; i++)
{
    for (int j = i; j < input.Length; j++)
    {
        var substring = input.Substring(i, j-i+1);
        stringList.Add(substring);
    }
}

foreach(var item in stringList)
{
    Console.WriteLine(item);
}

Update

You cannot improve on the iterations.

However you can improve performance, by using fixed arrays and pointers

like image 38
TheGeneral Avatar answered Oct 08 '22 18:10

TheGeneral


In some cases you can significantly increase execution speed by reducing object allocations. In this case by using a single char[] and ArraySegment<of char> to process substrings. This will also lead to use of less address space and decrease in garbage collector load.

Relevant excerpt from Using the StringBuilder Class in .NET page on Microsoft Docs:

The String object is immutable. Every time you use one of the methods in the System.String class, you create a new string object in memory, which requires a new allocation of space for that new object. In situations where you need to perform repeated modifications to a string, the overhead associated with creating a new String object can be costly.

Example implementation:

static List<ArraySegment<char>> SubstringsOf(char[] value)
{
    var substrings = new List<ArraySegment<char>>(capacity: value.Length * (value.Length + 1) / 2 - 1);
    for (int length = 1; length < value.Length; length++)
        for (int start = 0; start <= value.Length - length; start++)
            substrings.Add(new ArraySegment<char>(value, start, length));
    return substrings;
}

For more information check Fundamentals of Garbage Collection page on Microsoft Docs, what is the use of ArraySegment class? discussion on StackOverflow, ArraySegment<T> Structure page on MSDN and List<T>.Capacity page on MSDN.

like image 40
Leonid Vasilev Avatar answered Oct 08 '22 17:10

Leonid Vasilev