
Best way to count char occurrences in a string

Hello, I am trying to write these Python lines as a single line, but I am getting errors because the code modifies a dictionary.

for i in range(len(string)):
    if string[i] in dict:
        dict[string[i]] += 1

The general syntax, I believe, is:

abc = [i for i in len(x) if x[i] in array]

Could someone tell me how this might work, given that I am adding 1 to a value in a dictionary?

Thanks

Kartik asked Jan 21 '12 09:01


1 Answer

What you're trying to do can be done with dict, a generator expression and str.count():

abc = dict((c, string.count(c)) for c in string)

An alternative using set(string) (suggested in a comment by soulcheck), which calls str.count() only once per distinct character:

abc = dict((c, string.count(c)) for c in set(string))
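Both forms build the same mapping; the difference is only in how many times str.count() runs. A minimal sketch of that, where `text` is a made-up sample string and the comparison with collections.Counter is added purely as a sanity check:

```python
from collections import Counter

text = "hello world"  # hypothetical sample input, not from the question

# One str.count() call per character position: len(text) calls.
per_position = dict((c, text.count(c)) for c in text)

# One str.count() call per distinct character: len(set(text)) calls.
per_distinct = dict((c, text.count(c)) for c in set(text))

# Both agree with each other and with collections.Counter.
assert per_position == per_distinct == dict(Counter(text))

print(per_distinct["l"])          # 3
print(len(text), len(set(text)))  # 11 8
```

Since each str.count() call scans the whole string, the first form does work proportional to the square of the string length, while the second scales with length times the number of distinct characters.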

Timing

Given the comments below, I ran a small timing comparison between this and the other answers (with Python 3.2).

Test functions:

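from collections import Counter, defaultdict

# time_me is a timing decorator whose definition is not shown here
@time_me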
def test_dict(string, iterations):
    """dict((c, string.count(c)) for c in string)"""
    for i in range(iterations):
        dict((c, string.count(c)) for c in string)

@time_me
def test_set(string, iterations):
    """dict((c, string.count(c)) for c in set(string))"""
    for i in range(iterations):
        dict((c, string.count(c)) for c in set(string))

@time_me
def test_counter(string, iterations):
    """Counter(string)"""
    for i in range(iterations):
        Counter(string)

@time_me
def test_for(string, iterations, d):
    """for loop from cha0site"""
    for i in range(iterations):
        for c in string:
            if c in d:
                d[c] += 1

@time_me
def test_default_dict(string, iterations):
    """defaultdict from joaquin"""
    for i in range(iterations):
        mydict = defaultdict(int)
        for mychar in string:
            mydict[mychar] += 1
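The time_me decorator itself is not shown in the answer. Judging only from the output format in the results (elapsed milliseconds followed by the function's docstring), it might look roughly like the sketch below. This is a guess at its shape, not the author's code; `count_with_set` is a hypothetical stand-in for one of the test functions, and time.perf_counter() needs Python 3.3 or later.

```python
import functools
import time


def time_me(func):
    """Hypothetical timing decorator: print elapsed ms and the docstring."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000
        # Mirrors the "<ms> ms -  <docstring>" lines in the results.
        print('{:.3f} ms -  {}'.format(elapsed_ms, func.__doc__))
        return result
    return wrapper


@time_me
def count_with_set(string, iterations):
    """dict((c, string.count(c)) for c in set(string))"""
    for i in range(iterations):
        dict((c, string.count(c)) for c in set(string))


count_with_set('hand', 1000)
```

Using the docstring as the label is what lets each test function describe itself in the printed report.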

Test execution:

import string

d_ini = dict((c, 0) for c in string.ascii_letters)
words = ['hand', 'marvelous', 'supercalifragilisticexpialidocious']

for word in words:
    print('-- {} --'.format(word))
    test_dict(word, 100000)
    test_set(word, 100000)
    test_counter(word, 100000)
    test_for(word, 100000, d_ini)
    test_default_dict(word, 100000)
    print()

# ch holds the text of the chapter (loading not shown)
print('-- {} --'.format('Pride and Prejudice - Chapter 3 '))

test_dict(ch, 1000)
test_set(ch, 1000)
test_counter(ch, 1000)
test_for(ch, 1000, d_ini)
test_default_dict(ch, 1000)
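The same kind of comparison can also be run with the standard library's timeit module instead of a custom decorator. The sketch below is an alternative harness, not the one used for the results in this answer; the labels and the iteration count are arbitrary choices, and the numbers it prints will depend on the machine and Python version.

```python
import timeit
from collections import Counter

word = 'supercalifragilisticexpialidocious'

# Each candidate is a zero-argument callable that timeit can invoke.
candidates = {
    'count per character': lambda: dict((c, word.count(c)) for c in word),
    'count per distinct character': lambda: dict((c, word.count(c)) for c in set(word)),
    'Counter': lambda: Counter(word),
}

timings = {}
for label, func in candidates.items():
    # timeit.timeit returns the total time in seconds for `number` calls.
    seconds = timeit.timeit(func, number=10000)
    timings[label] = seconds
    print('{:10.3f} ms -  {}'.format(seconds * 1000, label))
```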

Test results:

-- hand --
389.091 ms -  dict((c, string.count(c)) for c in string)
438.000 ms -  dict((c, string.count(c)) for c in set(string))
867.069 ms -  Counter(string)
100.204 ms -  for loop from cha0site
241.070 ms -  defaultdict from joaquin

-- marvelous --
654.826 ms -  dict((c, string.count(c)) for c in string)
729.153 ms -  dict((c, string.count(c)) for c in set(string))
1253.767 ms -  Counter(string)
201.406 ms -  for loop from cha0site
460.014 ms -  defaultdict from joaquin

-- supercalifragilisticexpialidocious --
1900.594 ms -  dict((c, string.count(c)) for c in string)
1104.942 ms -  dict((c, string.count(c)) for c in set(string))
2513.745 ms -  Counter(string)
703.506 ms -  for loop from cha0site
935.503 ms -  defaultdict from joaquin

# !!!: Do not compare this last result with the others, because it is timed
#      with 1000 iterations instead of 100000
-- Pride and Prejudice - Chapter 3  --
155315.108 ms -  dict((c, string.count(c)) for c in string)
982.582 ms -  dict((c, string.count(c)) for c in set(string))
4371.579 ms -  Counter(string)
1609.623 ms -  for loop from cha0site
1300.643 ms -  defaultdict from joaquin
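One caveat when reading these numbers: the for loop variant only increments keys that already exist in the dictionary it is given, so with a dictionary seeded from string.ascii_letters it skips spaces and punctuation, and the counts keep accumulating if the same dictionary is reused between runs. A small sketch of the difference, using a made-up sample string:

```python
import string
from collections import defaultdict

text = "Hi, hi!"  # hypothetical sample input

# Pre-seeded dict, as in test_for: only keys that already exist are counted.
seeded = dict((c, 0) for c in string.ascii_letters)
for c in text:
    if c in seeded:
        seeded[c] += 1

# defaultdict, as in test_default_dict: every character gets a key.
counted = defaultdict(int)
for c in text:
    counted[c] += 1

print(sorted(c for c in seeded if seeded[c]))  # ['H', 'h', 'i']
print(sorted(counted))                         # [' ', '!', ',', 'H', 'h', 'i']
```

So the for loop timings measure slightly different work from the other approaches, especially on the long prose sample.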
Rik Poggi answered Sep 21 '22 13:09