Say, I have a string
s = 'AAABBBCAB'
How can I efficiently calculate the prefix sum of frequencies of each character in the string, i.e.:
psum = [{'A': 1}, {'A': 2}, {'A': 3}, {'A': 3, 'B': 1}, {'A': 3, 'B': 2}, {'A': 3, 'B': 3}, {'A': 3, 'B': 3, 'C': 1}, {'A': 4, 'B': 3, 'C': 1}, {'A': 4, 'B': 4, 'C': 1}]
Freq will be used to maintain the count of each character present in the string. Now, iterate through the string to compare each character with rest of the string. Increment the count of corresponding element in freq. Finally, iterate through freq to display the frequencies of characters.
Its main idea uses prefix sums which are defined as the consecutive totals of the first 0,1,2,...,n elements of an array. We can easily calculate the prefix sums in O(n) time complexity. Notice that the total pk equals pk−1 + ak−1 , so each consecutive value can be calculated in a constant time.
The code snippet for this is given as follows. for(int i = 0; str[i] != '\0'; i++) { if(str[i] == c) count++; } cout<<"Frequency of alphabet "<<c<<" in the string is "<<count; The program to find the frequency of all the alphabets in the string is given as follows.
Yes, prefix sums can be considered as a form of Dynamic Programming. It is the simplest way to calculate the sum of a range given a static array by using a prefix array which stores data based on previous sums.
You can do it in one line using itertools.accumulate
and collections.Counter
:
from collections import Counter from itertools import accumulate s = 'AAABBBCAB' psum = list(accumulate(map(Counter, s)))
This gives you a list of Counter
objects. Now, to get frequencies for any substring of s
in O(1) time, you can simply subtract counters, e.g.:
>>> psum[6] - psum[1] # get frequencies for s[2:7] Counter({'B': 3, 'A': 1, 'C': 1})
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With