
Sort the file in unix using first six characters of a line

Tags: unix, sorting

I want to sort a file using the first six characters of each line. The sort should ignore everything after the sixth character, so lines with the same first six characters keep their input order. I have tried the command below, but lines that tie on the first six characters still come out in the default sort order.

sort -k 1,6 filename.txt

Input File : "filename.txt"

09289720150531N201505220820D20150514
09289720150531N201505220820A20150516
08806020150531N201505290810D20150526
08806020150531N201505290810A20150528

Output should be:

08806020150531N201505290810D20150526
08806020150531N201505290810A20150528
09289720150531N201505220820D20150514
09289720150531N201505220820A20150516

But my command output is:

08806020150531N201505290810A20150528
08806020150531N201505290810D20150526
09289720150531N201505220820A20150516
09289720150531N201505220820D20150514
srisriv asked Jun 01 '15 09:06




1 Answer

The option as shown, -k 1,6, uses field positions (fields 1 through 6), not character positions. If you change that to something like -k1.1,1.6, it will use character positions 1 through 6 in the first field. This is an extended POSIX feature, likely to be provided on most platforms.

However, in your example there are only two distinct values in character positions 1-6: 088060 and 092897. The standard sort command does not have a feature for ignoring columns, only for using them. GNU sort provides an extension (-s, for "disabling last-resort comparison"), but Solaris sort does not. Without that option, once the sort keys have been taken into account, lines whose keys compare equal are ordered by the remainder of the line, which is why the A lines come out ahead of the D lines.

There is some vague wording in its manual which hints that -u will do what you want:

When there are multiple key fields, later keys are compared only after all earlier keys compare equal. Except when the -u option is specified, lines that otherwise compare equal are ordered as if none of the options -d, -f, -i, -n or -k were present (but with -r still in effect, if it was specified) and with all bytes in the lines significant to the comparison.

However, on revisiting this, that wording is misleading, since -u is used to filter duplicates.
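Where a sort without -s is the only one available, one possible workaround (not something suggested in the question or the comments) is to decorate each line with its key and its line number, sort on those two fields, then strip them again. This is a minimal sketch: the temporary file and the variable names tab and portable are made up for illustration, and it assumes the first six characters never contain a tab.

```shell
# Recreate the input from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
09289720150531N201505220820D20150514
09289720150531N201505220820A20150516
08806020150531N201505290810D20150526
08806020150531N201505290810A20150528
EOF

tab=$(printf '\t')

# 1. awk prefixes each line with: first six characters, TAB, line number, TAB.
# 2. sort orders by the six-character key, then numerically by line number,
#    so lines with equal keys stay in their input order.
# 3. cut drops the two helper fields and leaves the original line.
portable=$(
  awk '{ printf "%s\t%d\t%s\n", substr($0, 1, 6), NR, $0 }' "$tmp" |
    sort -t "$tab" -k1,1 -k2,2n |
    cut -f3-
)

printf '%s\n' "$portable"
rm -f "$tmp"
```

Because the line numbers are unique, no two decorated lines ever tie on both keys, so the last-resort comparison never gets a chance to reorder anything.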

A comment suggests that -k1.1,1.6 could be shortened to -k1.6, and testing with Solaris 10 confirmed that would work. That is with /usr/bin/sort, of course. On my copy of Solaris 10, there is an additional copy of sort, in /opt/sfw/bin/sort:

$ /opt/sfw/bin/sort --version
sort (GNU coreutils) 5.97

and that program supports the -s option noted above. With that option, the program produces the output which was requested.
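As a rough illustration of that last point, the following sketch combines the character-position key with -s. It assumes a sort that accepts -s (GNU sort, for instance); the temporary file and the variable name stable are only there to make the example self-contained.

```shell
# Recreate the input from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
09289720150531N201505220820D20150514
09289720150531N201505220820A20150516
08806020150531N201505290810D20150526
08806020150531N201505290810A20150528
EOF

# -k1.1,1.6 : the key is characters 1 through 6 of field 1
# -s        : disable the last-resort comparison, so lines whose
#             keys compare equal keep their input order
stable=$(sort -s -k1.1,1.6 "$tmp")

printf '%s\n' "$stable"
rm -f "$tmp"
```

With the question's data, the D lines stay ahead of the A lines within each six-character group, which is the ordering that was asked for.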

Thomas Dickey answered Nov 15 '22 07:11