I'm new to Python and programming, and it's not easy for me to get this stuff into my head. Because the books I started reading are completely boring, I started playing around with some ideas of my own.
Here is what I want to do: open a text file, count the frequency of every value in it (it's just a list of system names), sort the list by frequency, and print the result. After searching the web for some code to do it, I ended up with this:
file = open('C:\\Temp\\Test2.txt', 'r')
text = file.read()
file.close()
word_list = text.lower().split(None)
word_freq = {}
for word in word_list:
    word_freq[word] = word_freq.get(word, 0) + 1
list = sorted(word_freq.keys())
for word in list:
    print ("%-10s %d" % (word, word_freq[word]))
It works, but it sorts by the words / system names in the list rather than by their counts:
pc05010 3
pc05012 1
pc05013 8
pc05014 2
I want it like that:
pc05013 8
pc05010 3
pc05014 2
pc05012 1
I've now been searching for a sort-by-value function for hours. I bet it's easy, but I found nothing.
From my beginner's point of view, it has something to do with this line:
list = sorted(word_freq.keys())
I thought maybe it's:
list = sorted(word_freq.values())
But no... that only gives me the sorted counts without the names. It's very frustrating to see tons of information about this language and still not be able to get such a simple thing to work.
Please help :)
Thanks a lot!
Problems involving sorting and removing duplicates are quite common in everyday coding.
You have to use word_freq.items() here, and sort the (word, count) pairs by the count:
lis = sorted(word_freq.items(), key=lambda x: x[1], reverse=True)
for word, freq in lis:
    print ("%-10s %d" % (word, freq))
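For anyone who wants to try this without the text file, here is a minimal self-contained sketch. The dictionary contents are just the counts from the question's sample output, and the names `by_count` and `pair` are made up for illustration:

```python
# Sample counts copied from the question's output; real data would come from the file.
word_freq = {"pc05010": 3, "pc05012": 1, "pc05013": 8, "pc05014": 2}

# items() yields (name, count) pairs; the key function picks the count (index 1),
# and reverse=True puts the highest count first.
by_count = sorted(word_freq.items(), key=lambda pair: pair[1], reverse=True)

for word, freq in by_count:
    print("%-10s %d" % (word, freq))
```

If two names have the same count and you want them listed alphabetically, one option is `key=lambda pair: (-pair[1], pair[0])` without `reverse=True`.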
Don't use list as a variable name; it shadows the built-in list type.
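A small sketch of what goes wrong when the name is reused. The function `shadow_demo` is hypothetical and only there to keep the shadowing local, so the real built-in is untouched afterwards:

```python
def shadow_demo():
    # Inside this function, 'list' now refers to a plain list object,
    # hiding the built-in list type.
    list = sorted(["b", "a"])
    try:
        list("abc")  # tries to call the list object, not the built-in
    except TypeError as exc:
        return str(exc)
    return None

message = shadow_demo()
print(message)
```

The call fails with a TypeError because a list object is not callable, which can be confusing when it shows up far away from the assignment that caused it.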
Take a look at collections.Counter
>>> wordlist = ['foo', 'bar', 'foo', 'baz']
>>> import collections
>>> counter = collections.Counter(wordlist)
>>> counter.most_common()
[('foo', 2), ('baz', 1), ('bar', 1)]
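Applied to the question's problem, it could look roughly like the sketch below. The `text` string is a stand-in for what `file.read()` returns, built so that the counts match the question's sample output; the variable names are assumptions for illustration:

```python
from collections import Counter

# Stand-in for the contents of the text file (normally text = file.read()).
text = "pc05010 " * 3 + "pc05012 " + "pc05013 " * 8 + "pc05014 " * 2

# Counter tallies each name; most_common() returns (name, count) pairs
# ordered from highest to lowest count.
counts = Counter(text.lower().split())

for word, freq in counts.most_common():
    print("%-10s %d" % (word, freq))
```

This replaces both the manual counting loop and the sorting step. `most_common(n)` also accepts a number if only the top n names are wanted.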