
Sort list by frequency value in Python [duplicate]

I'm new to Python and programming, and it's not easy for me to get this stuff into my head. Because the books I started reading are completely boring, I started to play around with some ideas.

Here is what I want to do: open the text file, count the frequency of every single value (it's just a list of system names), sort the list by frequency, and return the result. After searching the web for some code to do it, I got this:

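# Read the whole file into one string
file = open('C:\\Temp\\Test2.txt', 'r')
text = file.read()
file.close()

# Split on whitespace into lowercase words
word_list = text.lower().split(None)

# Count how often each word occurs
word_freq = {}
for word in word_list:
    word_freq[word] = word_freq.get(word, 0) + 1

# Sort by word (the dictionary keys) and print each word with its count
list = sorted(word_freq.keys())
for word in list:
    print("%-10s %d" % (word, word_freq[word]))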

It works, but it sorts by the words (the system names) in the list:

pc05010    3
pc05012    1
pc05013    8
pc05014    2

I want it like this:

pc05013    8
pc05010    3
pc05014    2
pc05012    1

I've now been searching for the sort-by-value function for hours. I bet it's easy, but I found nothing.

From my beginner's point of view, it has something to do with this line:

list = sorted(word_freq.keys())

I thought maybe it's:

list = sorted(word_freq.values())

But no... It's very frustrating to see tons of information about this language and still not get such a simple thing to work.

Please help :)

Thanks a lot!

Fabster asked May 25 '13 12:05


2 Answers

You have to use word_freq.items() here. It gives you (word, count) pairs, which you can then sort by the count:

lis = sorted(word_freq.items(), key=lambda x: x[1], reverse=True)
for word, freq in lis:
    print("%-10s %d" % (word, freq))

Don't use list as a variable name; it shadows the built-in list type.
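
As a self-contained sketch of the same idea, the snippet below uses a small made-up word list instead of the file from the question (the sample data and the helper name sort_by_frequency are assumptions for illustration only). It also breaks ties alphabetically, which the one-liner above does not do.

```python
def sort_by_frequency(words):
    """Return (word, count) pairs, most frequent first; ties sorted by word."""
    word_freq = {}
    for word in words:
        word_freq[word] = word_freq.get(word, 0) + 1
    # Negating the count sorts descending by frequency, while the word
    # itself stays in ascending order as a tie-breaker.
    return sorted(word_freq.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample data standing in for the contents of the text file.
sample = ["pc05010"] * 3 + ["pc05012"] + ["pc05013"] * 8 + ["pc05014"] * 2

for word, freq in sort_by_frequency(sample):
    print("%-10s %d" % (word, freq))
```

If the tie order does not matter, the simpler key=lambda x: x[1] with reverse=True is enough.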

Ashwini Chaudhary answered Oct 11 '22 19:10


Take a look at collections.Counter. Its most_common() method returns the (item, count) pairs sorted by count, highest first:

>>> wordlist = ['foo', 'bar', 'foo', 'baz']
>>> import collections
>>> counter = collections.Counter(wordlist)
>>> counter.most_common()
[('foo', 2), ('bar', 1), ('baz', 1)]
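
Applied to the question, a rough sketch could look like this. The text here is an inline sample standing in for the file contents (an assumption; the real script would get it from file.read() as in the question), and count_names is a hypothetical helper name.

```python
import collections


def count_names(text):
    """Count whitespace-separated names, ignoring case."""
    return collections.Counter(text.lower().split())


# Hypothetical sample text; in the question this comes from file.read().
text = "pc05013 pc05010 PC05013 pc05014 pc05013 pc05010"

# most_common() yields (name, count) pairs, highest count first.
for name, freq in count_names(text).most_common():
    print("%-10s %d" % (name, freq))
```

Passing an integer, as in most_common(10), returns only that many of the most frequent entries.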

Blubber answered Oct 11 '22 17:10