
Sorting a nested list by the first item -- itemgetter not doing the trick

I have a dictionary that I've converted to a list so that I can sort by the first item. Each key in the dictionary is a string (of digits) and each value is an integer; both are kept in the list.
The list produced from the dictionary looks like:

[('228055', 1), ('228054', 1), ('228057', 2), ('228056', 1), ('228051', 1), ('228050', 1), ('228053', 1), ('203184', 6), ('228059', 1), ('228058', 1), ('89370', 2), ('89371', 3), ('89372', 2), ('89373', 1), ('89374', 1), ('89375', 1), ('89376', 1), ('89377', 1), ('89378', 1), ('89379', 1), ...]

There are around 240,000 items in the dictionary. I would like to sort the list by the first element of each tuple, but when I use itemgetter(0), all the keys starting with "1" come first. The sorted list looks like:

[('0', 3), ('1', 3), ('10', 3), ('100', 4), ('1000', 3), ('10000', 1), ('100000', 3), ('100001', 2), ('100002', 3), ('100003', 3), ('100004', 2), ('100005', 2), ('100006', 2), ('100007', 2), ('100008', 2), ('100009', 2), ('10001', 1), ('100010', 3), ('100011', 3), ('100012', 3), ('100013', 2), ('100014', 1), ('100015', 1), ('100016', 1), ('100017', 1), ('100018', 1), ....]

I would like the list to be sorted as [('0', 3), ('1', 3), ('2', integer), ('3', integer), ..., ('240000', integer)]

Here's my code: I read a text file into a dictionary, convert it to a list, and use itemgetter to sort by the first item of each nested tuple. I need the dictionary because I depend heavily on it to look up values by key. I only want to sort the data for the output file once all the processing has run. Thanks for any help.

import sys, string, csv, arcpy, os, fileinput, traceback
from arcpy import env
from operator import itemgetter


#Creating a dictionary of FID: LU_Codes from external txt file
text_file = open(r"H:\SWAT\NC\FID_Whole_Copy.txt", "rb")  # raw string keeps the backslashes literal
#Lines = text_file.readlines()
FID_GC_dict =  dict()
reader = csv.reader(text_file, delimiter='\t')
for line in reader:
    FID_GC_dict[line[0]] = int(line[1])
text_file.close()

dict_List = [(x, FID_GC_dict[x]) for x in FID_GC_dict.keys()]
dict_List.sort(key=itemgetter(0))
print dict_List
Linda asked Feb 25 '12 18:02



2 Answers

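That's because the keys are strings, and strings are compared character by character, so '10' sorts before '2'. Convert the key to an int when sorting:

dict_List.sort(key=lambda x: int(x[0]))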
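
To see the difference on a small scale, here is a minimal sketch. The `pairs` list is made-up sample data, not taken from the original file, and the code uses `print()` so that it also runs on Python 3.

```python
# Made-up sample data: string keys paired with integer values.
pairs = [('10', 3), ('2', 1), ('1', 3), ('100', 4)]

# Sorting on the raw string compares character by character,
# so every key starting with '1' comes before '2'.
by_string = sorted(pairs, key=lambda x: x[0])

# Converting the key to int gives numeric order.
by_number = sorted(pairs, key=lambda x: int(x[0]))

print(by_string)  # [('1', 3), ('10', 3), ('100', 4), ('2', 1)]
print(by_number)  # [('1', 3), ('2', 1), ('10', 3), ('100', 4)]
```

Note that `int()` raises a ValueError if any key is not a valid integer string, so this only applies when every key consists of digits.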
Ignacio Vazquez-Abrams answered Oct 26 '22 13:10



Changing the key to convert the string to an int will fix this. Here are some other sorting tips as well.

from operator import itemgetter

list_to_sort = [('89372', 2), ('89373', 1), ('89374', 1), ('89375', 1), ('89376', 1), ('89377', 1), ('228055', 1), ('228054', 1), ('228057', 2), ('228056', 1), ('228051', 1), ('228050', 1), ('228053', 1), ('203184', 6), ('228059', 1), ('228058', 1), ('89370', 2), ('89371', 3), ('89372', 2), ('89373', 1), ('89374', 1), ('89375', 1), ('89376', 1), ('89377', 1)]
print list_to_sort

list_to_sort.sort()
print list_to_sort  # wrong order: tuples compare on the string key first

list_to_sort.sort(key=itemgetter(0))
print list_to_sort  # same wrong order: the key is still a string

list_to_sort.sort(key=lambda x: int(x[0]))
print list_to_sort  # numeric order, ascending

list_to_sort.sort(key=lambda x: int(x[0]), reverse=True)
print list_to_sort  # numeric order, descending

Side note on building the list from the dict: iteritems() yields the (key, value) pairs directly, so there is no need to look up each key again. Instead of

dict_List = [(x, FID_GC_dict[x]) for x in FID_GC_dict.keys()]

you can write

dict_List = [(k, v) for k, v in FID_GC_dict.iteritems()]

(In Python 3, iteritems() no longer exists; use items() instead.)
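
Since the sorted order is only needed for the output file, one option is to leave the dictionary untouched and sort its items at write time. The sketch below assumes Python 3 and a tab-separated output format like the input file. The helper names `sorted_by_numeric_key` and `write_pairs`, the `sample` dict, and the in-memory buffer are all hypothetical stand-ins, not part of the original code.

```python
import csv
import io


def sorted_by_numeric_key(mapping):
    """Return (key, value) pairs ordered by the integer value of each key."""
    return sorted(mapping.items(), key=lambda kv: int(kv[0]))


def write_pairs(pairs, out_file):
    """Write the pairs as tab-separated rows to an open text file."""
    writer = csv.writer(out_file, delimiter='\t', lineterminator='\n')
    writer.writerows(pairs)


# Hypothetical sample standing in for FID_GC_dict.
sample = {'10': 3, '2': 1, '100': 4, '1': 3}
ordered = sorted_by_numeric_key(sample)

# An in-memory buffer stands in for the real output file.
buffer = io.StringIO()
write_pairs(ordered, buffer)
print(buffer.getvalue())
```

sorted() returns a new list, so the dictionary stays available for key lookups during the rest of the processing.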
Matt Alcock answered Oct 26 '22 11:10
