I have data in a text file that contains "Test DATA_g004, Test DATA_g003, Test DATA_g001, Test DATA_g002".
Is it possible to sort it while ignoring the prefix "Test DATA_", so that the data is sorted as g001, g002, g003, etc.?
I tried the .split("Test DATA_") method, but it doesn't work.
import re

def readFile():
    # the try block runs if the text file is found
    try:
        with open("test.txt", 'r') as f:
            data = f.read().split("\n")
        data.sort(key=alphaNum_Key)  # alternative sort function
        print(data)
    # the except block runs if no text file is found
    except IOError:
        print("Error: File does not exist")
    return

# Human sorting: turn a run of digits into an int, leave other text as-is
def alphaNum(text):
    return int(text) if text.isdigit() else text

# Human sorting: split the text into digit and non-digit chunks
def alphaNum_Key(text):
    return [alphaNum(c) for c in re.split(r'(\d+)', text)]
You can do this using re.
import re
x = "Test DATA_g004, Test DATA_g003, Test DATA_g001, Test DATA_g002"
print(sorted(x.split(","), key=lambda k: int(re.findall(r"(?<=_g)\d+$", k)[0])))
Output: [' Test DATA_g001', ' Test DATA_g002', ' Test DATA_g003', 'Test DATA_g004']
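To apply the same idea to text read from a file, the key can be pulled out into a small function. This is only a minimal sketch: `suffix_number` and `sort_entries` are hypothetical names, and it assumes entries are separated by commas or newlines and end in `_g` followed by digits.

```python
import re

def suffix_number(entry):
    """Return the trailing number of an entry such as 'Test DATA_g004' as an int."""
    match = re.search(r"_g(\d+)\s*$", entry)
    # Assumption for this sketch: entries without a '_g<digits>' suffix sort first.
    return int(match.group(1)) if match else -1

def sort_entries(text):
    """Split on commas or newlines, drop blanks, and sort by the numeric suffix."""
    entries = [e.strip() for e in re.split(r"[,\n]", text) if e.strip()]
    return sorted(entries, key=suffix_number)

sample = "Test DATA_g004, Test DATA_g003, Test DATA_g001, Test DATA_g002"
print(sort_entries(sample))
# ['Test DATA_g001', 'Test DATA_g002', 'Test DATA_g003', 'Test DATA_g004']
```

With the file from the question, the call would look like `sort_entries(open("test.txt").read())`, ideally inside a `with` block so the file gets closed.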
Retrieve all the labels that start with g, then sort the resulting list with sorted:
>>> import re
>>> s = "Test DATA_g004, Test DATA_g003, Test DATA_g001, Test DATA_g002, "
>>> sorted(re.findall(r'g\d+', s))
['g001', 'g002', 'g003', 'g004']
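One caveat: plain `sorted` compares strings character by character, so it gives the expected order here only because the numbers are zero-padded to the same width. The sketch below uses made-up labels without padding (not from the question) to show the difference when the digits are compared as integers.

```python
labels = ["g10", "g9", "g2"]  # hypothetical labels without zero padding

# Plain string comparison puts "g10" before "g2" because "1" < "2".
lexical = sorted(labels)

# Comparing the digits as integers gives the numeric order instead.
numeric = sorted(labels, key=lambda label: int(label[1:]))

print(lexical)  # ['g10', 'g2', 'g9']
print(numeric)  # ['g2', 'g9', 'g10']
```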
Another way is to use only built-in methods:
>>> l = [x.split('_')[1] for x in s.split(', ') if x]
>>> l
['g004', 'g003', 'g001', 'g002']
>>> l.sort()
>>> l
['g001', 'g002', 'g003', 'g004']
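If the full entries should be kept and only their order changed, the part after the underscore can serve as the sort key instead of replacing the entry. A rough sketch with built-ins only, assuming every entry contains an underscore; `entries` is just an illustrative name.

```python
s = "Test DATA_g004, Test DATA_g003, Test DATA_g001, Test DATA_g002, "

# Split on ", " and drop the empty string left by the trailing separator.
entries = [x for x in s.split(", ") if x]

# Sort the full entries, comparing only the text after the first underscore.
entries.sort(key=lambda x: x.split("_", 1)[1])

print(entries)
# ['Test DATA_g001', 'Test DATA_g002', 'Test DATA_g003', 'Test DATA_g004']
```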