
Nested for-loops and dictionaries in finding value occurrence in string

I've been tasked with creating a dictionary whose keys are the elements found in a string and whose values are the number of times each element occurs.

Ex.

"abracadabra" → {'r': 2, 'd': 1, 'c': 1, 'b': 2, 'a': 5}

I have the for-loop logic behind it here:

xs = "hshhsf"
xsUnique = "".join(set(xs))

occurrences = []
freq = []

counter = 0

for i in range(len(xsUnique)):
    for x in range(len(xs)):
        if xsUnique[i] == xs[x]:
            occurrences.append(xs[x])
            counter += 1
    freq.append(counter)
    freq.append(xsUnique[i])
    counter = 0  # reset before counting the next unique character

This does exactly what I want it to do, except with lists instead of dictionaries. How can I make it so counter becomes a value, and xsUnique[i] becomes a key in a new dictionary?

ChrisAngj asked Jul 05 '15 18:07


3 Answers

The easiest way is to use a Counter:

>>> from collections import Counter
>>> Counter("abracadabra")
Counter({'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1})
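As a minimal sketch of how the Counter result relates to the plain dictionary the question asks for: Counter is a dict subclass, so it can be used like a dictionary directly or converted with dict(). The variable names counts and plain are illustrative only.

```python
from collections import Counter

counts = Counter("abracadabra")

# Counter is a dict subclass, so lookups work like a normal dictionary.
print(counts["a"])   # 5

# Unlike a plain dict, a missing key returns 0 instead of raising KeyError.
print(counts["z"])   # 0

# Convert to a plain dict if the exact type matters.
plain = dict(counts)
print(plain == {"a": 5, "b": 2, "r": 2, "c": 1, "d": 1})   # True

# most_common(n) lists the n most frequent elements with their counts.
print(counts.most_common(1))   # [('a', 5)]
```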

If you can't use a Python library, you can use dict.get with a default value of 0 to make your own counter:

s="abracadabra"
count={}
for c in s:
    count[c] = count.get(c, 0)+1

>>> count
{'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1}    

Or, you can use dict.fromkeys() to set all the values in a counter to zero and then use that:

>>> counter=dict.fromkeys(s, 0)
>>> counter
{'a': 0, 'r': 0, 'b': 0, 'c': 0, 'd': 0}
>>> for c in s:
...    counter[c]+=1
... 
>>> counter
{'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1}

If you truly want the least Pythonic approach, i.e., what you might do in C, you could:

  1. Create a list of zeros, one slot per possible ASCII value
  2. Loop over the string and count the characters that are present
  3. Print the nonzero counts

Example:

ascii_counts=[0]*256  # one slot per 8-bit character code (0-255)
s="abracadabra"

for c in s:
    ascii_counts[ord(c)]+=1

for i, e in enumerate(ascii_counts):
    if e:
        print('%s %d' % (chr(i), e))

Prints:

a 5
b 2
c 1
d 1
r 2

That does not scale to Unicode, however, since you would need a list with more than 1 million entries (one for each of the 1,114,112 possible code points).
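By contrast, a dictionary only stores the characters that actually occur, so the dict.get approach works unchanged for non-ASCII text. The following is a rough sketch under that assumption; the sample string and variable names are made up for illustration, and the code-point total shown holds on Python 3.3 and later.

```python
import sys

# Hypothetical sample text with accented letters and two snowman characters.
text = "na\u00efve caf\u00e9 \u2603\u2603"

counts = {}
for ch in text:
    counts[ch] = counts.get(ch, 0) + 1

# Only characters that are present get an entry, however large the
# character set is.
print(len(counts))        # 10
print(counts["\u2603"])   # 2

# Size a list-based table would need to cover every code point
# (Python 3.3+, where sys.maxunicode is always 0x10FFFF).
print(sys.maxunicode + 1)  # 1114112
```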

dawg answered Nov 19 '22 22:11


You can use the zip function to convert your list to a dictionary:

>>> dict(zip(freq[1::2],freq[0::2]))
{'h': 3, 's': 2, 'f': 1}
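To spell out what the slices are doing, here is a small sketch. It assumes freq has the flat layout that the question's loop produces (count, character, count, character, ...); the list is written out by hand here, and the names keys, values and result are illustrative.

```python
# Flat list in the layout built by the question's loop:
# each count is followed by the character it belongs to.
freq = [3, "h", 2, "s", 1, "f"]

keys = freq[1::2]     # every second item starting at index 1: the characters
values = freq[0::2]   # every second item starting at index 0: the counts

print(keys)     # ['h', 's', 'f']
print(values)   # [3, 2, 1]

# zip pairs each character with its count; dict turns the pairs into a mapping.
result = dict(zip(keys, values))
print(result == {"h": 3, "s": 2, "f": 1})   # True
```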

But as a more Pythonic and well-optimized way, I suggest using collections.Counter:

>>> from collections import Counter
>>> Counter("hshhsf")
Counter({'h': 3, 's': 2, 'f': 1})

And if you would rather not import any module, you can build the dictionary with the dict.setdefault method and a simple loop:

>>> d={}
>>> for i in xs:
...    d[i]=d.setdefault(i,0)+1
... 
>>> d
{'h': 3, 's': 2, 'f': 1}
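A short sketch of why this works, and how it differs from dict.get: setdefault stores the default when the key is missing and returns the stored value either way, while get only returns a value and never modifies the dictionary. The variable names are illustrative.

```python
d = {}

# Key missing: setdefault stores the default (0) and returns it.
first = d.setdefault("h", 0)
print(first, d)    # 0 {'h': 0}

d["h"] = first + 1

# Key present: setdefault returns the stored value and ignores the default.
second = d.setdefault("h", 0)
print(second)      # 1

# dict.get returns the default but leaves the dictionary untouched.
e = {}
print(e.get("h", 0), e)   # 0 {}
```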
Mazdak answered Nov 20 '22 00:11


I'm guessing there's a learning reason why you're using two for loops? Anyway, here are a few different solutions:

# Method 1
xs = 'hshhsf'
xsUnique = ''.join(set(xs))

freq1 = {}
for i in range(len(xsUnique)):
    for x in range(len(xs)):
        if xsUnique[i] == xs[x]:
            if xs[x] in freq1:
                freq1[xs[x]] += 1
            else:
                freq1[xs[x]] = 1 # Introduce a new key, value pair

# Method 2
# Or use a defaultdict, which automatically initializes new keys in a dictionary
# https://docs.python.org/2/library/collections.html#collections.defaultdict

from collections import defaultdict

freq2 = defaultdict(int) # new values initialize to 0
for i in range(len(xsUnique)):
    for x in range(len(xs)):
        if xsUnique[i] == xs[x]:
            # no need to check whether xs[x] is in the dict because
            # defaultdict(int) sets any missing key to zero and then
            # performs its operation.
            freq2[xs[x]] += 1


# I don't understand why you're using two for loops, though

# Method 3
string = 'hshhsf' # the variable name `xs` confuses me, sorry

freq3 = defaultdict(int)
for char in string:
    freq3[char] += 1

# Method 4
freq4 = {}
for char in string:
    if char in freq4:
        freq4[char] += 1
    else:
        freq4[char] = 1



print('freq1: %r\n' % freq1)
print('freq2: %r\n' % freq2)
print('freq3: %r\n' % freq3)
print('freq4: %r\n' % freq4)

print('\nDo all the dictionaries equal each other as they stand?')
print('Answer: %r\n\n' % (freq1 == freq2 and freq1 == freq3 and freq1 == freq4))

# convert the defaultdict's to a dict for consistency
freq2 = dict(freq2)
freq3 = dict(freq3)

print('freq1: %r' % freq1)
print('freq2: %r' % freq2)
print('freq3: %r' % freq3)
print('freq4: %r' % freq4)

Output

freq1: {'h': 3, 's': 2, 'f': 1}
freq2: defaultdict(<type 'int'>, {'h': 3, 's': 2, 'f': 1})
freq3: defaultdict(<type 'int'>, {'h': 3, 's': 2, 'f': 1})
freq4: {'h': 3, 's': 2, 'f': 1}

Do all the dictionaries equal each other as they stand?
Answer: True


freq1: {'h': 3, 's': 2, 'f': 1}
freq2: {'h': 3, 's': 2, 'f': 1}
freq3: {'h': 3, 's': 2, 'f': 1}
freq4: {'h': 3, 's': 2, 'f': 1}

Or, as dawg stated, use Counter from the collections module in the standard library.

counter docs

https://docs.python.org/2/library/collections.html#collections.Counter

defaultdict docs

https://docs.python.org/2/library/collections.html#collections.defaultdict

collections library docs

https://docs.python.org/2/library/collections.html
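For comparison, the two nested loops in Method 1 amount to one full scan of the string per distinct character. A compact sketch of that same idea, which is not part of the methods above and is offered only as an illustration, uses str.count inside a dict comprehension:

```python
xs = "hshhsf"

# For each distinct character, str.count scans the whole string once,
# which mirrors the inner loop of the nested-loop version.
freq = {ch: xs.count(ch) for ch in set(xs)}

print(freq == {"h": 3, "s": 2, "f": 1})   # True
```

This still does one pass per distinct character, so the single-loop versions (Methods 3 and 4) or Counter are likely the better choice for long strings with many different characters.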

Brandon Nadeau answered Nov 19 '22 23:11