I'm working on Linux, and the sort command does not return the order I expect.
Input text:
$ cat input.txt
rep1_1.fq
rep1_2.fq
rep12_1.fq
rep12_2.fq
Command and output:
$ sort input.txt
rep1_1.fq
rep12_1.fq
rep12_2.fq
rep1_2.fq
$ sort --version
sort (GNU coreutils) 8.28
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Mike Haertel and Paul Eggert.
I expected rep1_2.fq to come right after rep1_1.fq, but sort places it last.
Solved
Following @Federico klez Culloca's advice, I used LC_ALL=C:
$ LC_ALL=C sort input.txt
rep12_1.fq
rep12_2.fq
rep1_1.fq
rep1_2.fq
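A likely explanation, assuming the default locale was a UTF-8 one such as en_US.UTF-8 (the question does not show it): locale-aware collation on this system appears to ignore punctuation such as "_" and "." on its first pass, so rep1_2.fq is compared roughly as rep12fq, which lands after rep122fq because digits sort before letters. With LC_ALL=C, sort compares raw bytes instead, and "2" (0x32) comes before "_" (0x5F), which is why the rep12_* names move to the front. The sketch below only imitates the locale behaviour by deleting the punctuation before a byte-wise sort; it is an approximation for illustration, not what the C library literally does.

```shell
# Byte order in the C locale: the digit 2 (0x32) sorts before the underscore (0x5F).
byte_order=$(printf '%s\n' _ 2 | LC_ALL=C sort | paste -sd ' ' -)
echo "$byte_order"

# Imitate a punctuation-ignoring collation (an assumption about the default locale):
# delete "_" and "." from each name, then sort the remainder byte-wise.
stripped_order=$(printf '%s\n' rep1_1.fq rep1_2.fq rep12_1.fq rep12_2.fq \
  | tr -d '_.' | LC_ALL=C sort | paste -sd ' ' -)
echo "$stripped_order"
```

The second line of output has rep12fq (that is, rep1_2.fq) last, matching the order that plain sort produced in the question.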
Edited
Setting LC_ALL=C also changes the order in which ls lists the files in a directory.
For example, with these four files in the current directory:
$ LC_ALL= ls
rep1_1.fq rep12_1.fq rep12_2.fq rep1_2.fq
$ LC_ALL=C ls
rep12_1.fq rep12_2.fq rep1_1.fq rep1_2.fq
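If numeric order is wanted in the listing rather than byte order, GNU ls has a -v option (natural sort of version numbers within text). A minimal sketch, assuming GNU coreutils ls; the scratch directory is hypothetical and only there to make the example self-contained:

```shell
# Create the four example files in a temporary directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/rep1_1.fq" "$demo_dir/rep1_2.fq" \
      "$demo_dir/rep12_1.fq" "$demo_dir/rep12_2.fq"

# -v asks GNU ls to order names by the numbers embedded in them.
listing=$(cd "$demo_dir" && ls -v | paste -sd ' ' -)
echo "$listing"

# Clean up the temporary directory.
rm -r "$demo_dir"
```

Other ls implementations may give -v a different meaning, so check the local man page first.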
Try version sort (-V). From the manual:
-V, --version-sort
natural sort of (version) numbers within text
This is the output using your example:
$ sort -V input.txt
rep1_1.fq
rep1_2.fq
rep12_1.fq
rep12_2.fq
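If -V is not available in your sort, the same order can probably be reached with explicit numeric keys. This is a sketch that assumes every name follows the rep<number>_<number>.fq pattern from the question; it is not a general replacement for version sort.

```shell
# Split each line on "_".
# Key 1: field 1 starting at its 4th character (the digits after "rep"), numeric.
# Key 2: field 2 (for example "1.fq"), numeric, so only its leading number counts.
key_sorted=$(printf '%s\n' rep1_1.fq rep12_1.fq rep12_2.fq rep1_2.fq \
  | LC_ALL=C sort -t_ -k1.4,1n -k2,2n | paste -sd ' ' -)
echo "$key_sorted"
```

LC_ALL=C is set here only so that the tie-breaking comparison does not depend on the locale.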