Shell scripting cut -d " " -f4 file.txt command

Tags:

shell

I have a file with words separated by a single space. I want to read the 4th word from each line of the file using this command:

cut -d " " -f4 file.txt

It works fine, but I don't understand its behavior.

If a line contains 4 or more words then it prints the 4th word.

If a line contains only 1 word then it prints that word.

If a line contains 2 or 3 words then it prints nothing.

I want to know how this works.

jnrdn0011 asked Dec 22 '22 19:12


2 Answers

From man cut:

   -f, --fields=LIST
          select only these fields;  also print any line that contains no delimiter character, unless the -s option is specified

If a line contains 1 word, then it does not contain the delimiter and therefore cut prints the whole line (which is exactly that one word).

In the other cases the line contains at least one delimiter, so cut splits it into fields and prints the fourth one. If there is no fourth field (2 or 3 words), it prints an empty line.

If you add the -s option, lines without a delimiter (the one-word lines) are skipped entirely. Lines with 2 or 3 words still produce an empty line, because they do contain the delimiter.
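The cases from the question can be reproduced with a small sample. This is only a sketch: the `sample` input is made up for illustration (it is not the asker's file.txt), and the `sed` call is there just to wrap each output line in brackets so the empty lines are visible.

```shell
# Hypothetical sample input: one line each with 1, 2, 3, 4 and 5 words.
sample=$(printf '%s\n' 'one' 'one two' 'one two three' \
                       'one two three four' 'one two three four five')

# No -s: the line without a delimiter is passed through unchanged,
# and the lines with 2 or 3 words yield an empty 4th field.
default_out=$(printf '%s\n' "$sample" | cut -d ' ' -f4)

# Wrap each output line in [] so empty lines show up.
printf '%s\n' "$default_out" | sed 's/.*/[&]/'
# [one]
# []
# []
# [four]
# [four]
```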

eumiro answered Dec 25 '22 09:12


By default, cut expects each input line to contain the delimiter (a space in the OP's example). Lines that do not contain the delimiter are printed as-is.

The default behavior can be changed with -s, which suppresses lines that do not contain the delimiter (the one-word case), so only the 4th field of delimited lines is printed. Use

cut -s -d " " -f4 file.txt
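A minimal sketch of what -s changes, again using a made-up `sample` rather than the asker's file. The final `grep .` step is an extra assumption on my part, not something from the man page: it is one common way to drop the empty lines left by 2- and 3-word input.

```shell
# Hypothetical sample input: one line each with 1, 2, 3, 4 and 5 words.
sample=$(printf '%s\n' 'one' 'one two' 'one two three' \
                       'one two three four' 'one two three four five')

# With -s the delimiter-free line "one" is dropped; the 2- and 3-word
# lines still contain a delimiter, so they produce empty output lines.
only_delimited=$(printf '%s\n' "$sample" | cut -s -d ' ' -f4)
printf '%s\n' "$only_delimited" | sed 's/.*/[&]/'
# []
# []
# [four]
# [four]

# Optional: keep only lines that really have a 4th word.
non_empty=$(printf '%s\n' "$sample" | cut -s -d ' ' -f4 | grep .)
printf '%s\n' "$non_empty"
# four
# four
```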

As to why this is the default behavior, there is no clear answer. Probably it was meant to let some lines pass through the filter untouched. Early Unix systems had a lot of semi-structured files, and this could have been useful when processing man pages, nroff pages and similar.

From the man page:

-f list

Cut based on a list of fields, assumed to be separated in the file by a delimiter character (see -d). Each selected field shall be output. Output fields shall be separated by a single occurrence of the field delimiter character. Lines with no field delimiters shall be passed through intact, unless -s is specified. It shall not be an error to select fields not present in the input line.

-s, --only-delimited do not print lines not containing delimiters

See also: https://unix.stackexchange.com/questions/157677/does-cut-return-any-fields-if-separator-does-not-exist

dash-o answered Dec 25 '22 09:12