I was solving a problem to find number of such indexes, a, b, c, d
in a string s
, of size n
made only of lowercase letters such that:
1 <= a < b < c < d <= n
and
s[a] == s[c] and s[b] == s[d]
The code I wrote traverses the string character by character in a basic manner:
#include<stdio.h>
int main()
{
int n, count = 0;
char s[2002];
scanf("%d%s", &n, s);
for(int a = 0; a<n-3; a++)
{
for(int b = a + 1; b<n-2; b++)
{
for(int c = b + 1; c<n-1; c++)
{
for(int d = c + 1; d<n; d++)
{
if(s[a] == s[c] && s[b] == s[d] && a>=0 && b>a && c>b && d>c && d<n)
{
count++;
}
}
}
}
}
printf("%d", count);
return 0;
}
a, b, c and d are the indices. The trouble is that if the input string is big in size, the time limit is exceeded due to the 4 nested loops. Is there any way I can improve the code to decrease the complexity?
The problem statement is available here: https://www.hackerearth.com/practice/algorithms/searching/linear-search/practice-problems/algorithm/holiday-season-ab957deb/
The problem can be solved if you maintain an array which stores the cumulative frequency (the total of a frequency and all frequencies so far in a frequency distribution) of each character in the input string. Since the string will only consist of lower case characters, hence the array size will be [26][N+1]
.
For example:
index - 1 2 3 4 5
string - a b a b a
cumulativeFrequency array:
0 1 2 3 4 5
a 0 1 1 2 2 3
b 0 0 1 1 2 2
I have made the array by taking the index of first character of the input string as 1. Doing so will help us in solving the problem later. For now, just ignore column 0 and assume that the string starts from index 1 and not 0.
Using cumulative frequency array we can easily check if a character is present at any index i:
if cumulativeFrequency[i]-cumulativeFrequency[i-1] > 0
number of times a character is present from range i to j (excluding both i and j):
frequency between i and j = cumulativeFrequency[j-1] - cumulativeFrequency[i]
1: for each character from a-z:
2: Locate index a and c such that charAt[a] == charAt[c]
3: for each pair (a, c):
4: for character from a-z:
5: b = frequency of character between a and c
6: d = frequency of character after c
7: count += b*d
Line 1-2:
The outer most loop will run for 26 times. We need to locate all the pair(a, c), to do that we require a time complexity of O(n^2).
Line 3-4:
For each pair, we again run a loop 26 times to check how many times each character is present between a and c and after c.
Line 5-7:
Using cumulative frequency array, for each character we can easily calculate how many times it appears between a and c and after c in O(1).
Hence, overall complexity is O(26*n^2*26)
= O(n^2)
.
I code in Java. I do not have a code in C. I have used simple loops an array so it should be easy to understand.
//Input N and string
//Do not pay attention to the next two lines since they are basically taking
//input using Java input streams
int N = Integer.parseInt(bufferedReader.readLine().trim());
String str = bufferedReader.readLine().trim();
//Construct an array to store cumulative frequency of each character in the string
int[][] cumulativeFrequency = new int[26][N+1];
//Fill the cumulative frequency array
for (int i = 0;i < str.length();i++)
{
//character an index i
char ch = str.charAt(i);
//Fill the cumulative frequency array for each character
for (int j = 0;j < 26;j++)
{
cumulativeFrequency[j][i+1] += cumulativeFrequency[j][i];
if (ch-97 == j) cumulativeFrequency[j][i+1]++;
}
}
int a, b, c, d;
long count = 0;
//Follow the steps of the algorithm here
for (int i = 0;i < 26;i++)
{
for (int j = 1; j <= N - 2; j++)
{
//Check if character at i is present at index j
a = cumulativeFrequency[i][j] - cumulativeFrequency[i][j - 1];
if (a > 0)
{
//Check if character at i is present at index k
for (int k = j + 2; k <= N; k++)
{
c = cumulativeFrequency[i][k] - cumulativeFrequency[i][k - 1];
if (c > 0)
{
//For each character, find b*d
for (int l = 0; l < 26; l++)
{
//For each character calculate b and d
b = cumulativeFrequency[l][k-1] - cumulativeFrequency[l][j];
d = cumulativeFrequency[l][N] - cumulativeFrequency[l][k];
count += b * d;
}
}
}
}
}
}
System.out.println(count);
I hope I have helped you. The code I provided will not give time complexity error and it will work for all test cases. Do comment if you do not understand anything in my explanation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With