Consider the following problem statement:
Given an unsorted array of integers, find a subarray which adds to a given number. If there are more than one subarrays with sum as the given number, print any of them.
Examples:
Input: arr[] = {1, 4, 20, 3, 10, 5}, sum = 33
Ouptut: Sum found between indexes 2 and 4
Input: arr[] = {10, 2, -2, -20, 10}, sum = -10
Ouptut: Sum found between indexes 0 to 3
Input: arr[] = {-10, 0, 2, -2, -20, 10}, sum = 20
Ouptut: No subarray with given sum exists
On this site, the following linear time solution was suggested involving the use of a map to store the sums of the current subsets as the algorithm iterates through the array:
// Function to print subarray with sum as given sum
void subArraySum(int arr[], int n, int sum)
{
// create an empty map
unordered_map<int, int> map;
// Maintains sum of elements so far
int curr_sum = 0;
for (int i = 0; i < n; i++)
{
// add current element to curr_sum
curr_sum = curr_sum + arr[i];
// if curr_sum is equal to target sum
// we found a subarray starting from index 0
// and ending at index i
if (curr_sum == sum)
{
cout << "Sum found between indexes "
<< 0 << " to " << i << endl;
return;
}
// If curr_sum - sum already exists in map
// we have found a subarray with target sum
if (map.find(curr_sum - sum) != map.end())
{
cout << "Sum found between indexes "
<< map[curr_sum - sum] + 1
<< " to " << i << endl;
return;
}
map[curr_sum] = i;
}
// If we reach here, then no subarray exists
cout << "No subarray with given sum exists";
}
// Driver program to test above function
int main()
{
int arr[] = {10, 2, -2, -20, 10};
int n = sizeof(arr)/sizeof(arr[0]);
int sum = -10;
subArraySum(arr, n, sum);
return 0;
}
With the test case that is provided in the post, the second if statement checking whether current_sum - sum is never entered. Instead, the sum is found and is printed in the first if statement. So there a few points of confusion here:
1: What is the purpose of looking up current_sum-sum
in the map?
2: In which cases would the second if statement even be entered to put the map to use in solving the problem?
Find if there is a subarray with 0 sum using hashing: The idea is to iterate through the array and for every element arr[i], calculate the sum of elements from 0 to i (this can simply be done as sum += arr[i]). If the current sum has been seen before, then there is a zero-sum array.
Naive Approach: Consider the sum of all the sub-arrays and return the length of the longest sub-array having the sum 'k'. Time Complexity is of O(n^2). Implementation: C++
Iterate through the array. At each element add the next n elements one by one, when the sum equals to the required value print the sub array.
curr_sum = curr_sum + arr[i];
Here:
curr_sum holds the sum of all the entries(up to arr[i]) starting from arr[0]
if (curr_sum == sum)
{
cout << "Sum found between indexes "
<< 0 << " to " << i << endl;
return;
}
Here:
The if condition checks if the curr_sum is equal to the given sum. Remember, curr_sum holds the sum starting from arr[0].
So, if the indexes for your result subarray is 2 and 4, the above condition is not enough.
Input: arr[] = {1, 4, 20, 3, 10, 5}, sum = 33
Ouptut: Sum found between indexes 2 and 4
In that example,
when i = 4, curr_sum would be 38. But if you take 1 and 4 out(arr[0] and arr[1] which is a subarray), you get 33.
if (map.find(curr_sum - sum) != map.end())
{
cout << "Sum found between indexes "
<< map[curr_sum - sum] + 1
<< " to " << i << endl;
return;
}
map[curr_sum] = i;
The above code does that.
map[curr_sum] = i;
map stores the index value (i) with the subarray sum ( sum of the arr between indexes 0 and i ) as key. So for the example, it would be:
This code:
map.find(curr_sum - sum)
searches the map is if it has an entry with key (curr_sum - sum).
In our example, when i = 4, curr_sum would be 38 and sum as given would be 33. If we eliminate the subarray arr[0,2], we get 33. So, map.find(curr_sum - sum) => map.find(38-33) is a hit since we have entry map[5] = 1.
Hope this clears you
There are two cases:
The sum from indices 0 to n
is the number we're looking for. cur_sum
is the running sum of all the indices, so it's as easy as just checking that variable.
The sum from indices m
to n
is the number we're looking for. If that is the case, then there exists an m
such that the (sum from 0 to n
) minus (sum from 0 to m
) is the sum we're looking for. All those intermediate sums are stored in the map, so that's why we're looking for cur_sum - sum
- if that difference corresponds to some partial sum, then we're done.
An example of this second case would be the first test case in your question - looking for 33 in {1, 4, 20, 3, 10, 5}
. At n==4
, we have cur_sum
of 38
. But 5
is in the map as being the sum of the first two indices.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With