I have a txt file that contains data in the following fashion:
23 1 65 15 19.2
19 2 66 25 25.7
10 3 67 35 16.5
100 4 68 45 10.4
20 5 69 55 6.8
201 6 64 65 9.2
Within the file, the values on each line are separated by \t, and each line ends with \n.
I want to sort this file by the first value on each line. My expected output is:
10 3 67 35 16.5
19 2 66 25 25.7
20 5 69 55 6.8
23 1 65 15 19.2
100 4 68 45 10.4
201 6 64 65 9.2
But the actual output I am getting is:
10 3 67 35 16.5
100 4 68 45 10.4
19 2 66 25 25.7
20 5 69 55 6.8
201 6 64 65 9.2
23 1 65 15 19.2
It's treating the values as strings, so the first column is not compared as integers. I tried parsing the values, but it's not working.
My code:
from operator import itemgetter

with open('filename.txt') as fin:
    lines = [line.split() for line in fin]
lines.sort(key=itemgetter(0),reverse=True)
with open('newfile.txt', 'w') as fout:
    for i in lines:
        fout.write('{0}\t\t\t\t\n'.format('\t\t\t '.join(i)))
Please help if possible.
You're currently comparing strings; you need to compare integers. Also drop reverse=True, which sorts in descending order, since your expected output is ascending:
lines.sort(key=lambda x: int(x[0]))
Strings are compared lexicographically, so:
>>> '2' > '100'
True
Conversion to int fixes this issue:
>>> int('2') > int('100')
False
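Putting the pieces together, here is a minimal sketch of the numeric sort on the sample data from the question. The helper name sort_rows_by_first_column and the in-memory sample string are assumptions made for illustration; in your script the rows would come from filename.txt instead, and the output here uses a single tab between fields rather than your original padding.

```python
def sort_rows_by_first_column(rows):
    """Return rows sorted ascending by the integer value of their first field."""
    # int() makes '100' compare as the number 100 rather than as text.
    return sorted(rows, key=lambda row: int(row[0]))


# Sample data from the question, tab-separated (stands in for the file contents).
sample = (
    "23\t1\t65\t15\t19.2\n"
    "19\t2\t66\t25\t25.7\n"
    "10\t3\t67\t35\t16.5\n"
    "100\t4\t68\t45\t10.4\n"
    "20\t5\t69\t55\t6.8\n"
    "201\t6\t64\t65\t9.2\n"
)

# Split each non-empty line into its fields, as in the original code.
rows = [line.split() for line in sample.splitlines() if line.strip()]
sorted_rows = sort_rows_by_first_column(rows)

# Rebuild tab-separated lines; write this string to newfile.txt if needed.
output = "".join("\t".join(row) + "\n" for row in sorted_rows)
print(output, end="")
```

Note that int() raises ValueError if the first field is not a whole number, so a header row or a blank first column would need to be handled separately.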
Also take a look at pandas if you plan to do more complicated manipulations later. For example:
import pandas as pd
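# sort_values(by=0) replaces the older DataFrame.sort(columns=0), which newer pandas no longer provides
pd.read_table('filename.txt', header=None)\
    .sort_values(by=0)\
    .to_csv('newfile.txt', sep='\t', header=False, index=False)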