
Respect last line if it's not terminated with a new line char (\n) when using read

I have noticed for a while that read never actually reads the last line of a file if the file does not end with a newline character. This is understandable if one considers that, as long as a file contains no newline character, it counts as containing 0 lines (which is quite hard to accept!). See, for example, the following:

$ echo 'foo' > bar ; wc -l bar
1 bar

But...

$ echo -n 'bar' > foo ; wc -l foo
0 foo

The question is then: how can I handle such situations when using read to process files that I have not created or modified myself, and for which I don't know whether they actually end with a newline character?

michaelmeyer asked Jan 27 '13 05:01



3 Answers

read does, in fact, read an unterminated line into the assigned variable ($REPLY by default). It also returns a non-zero (false) status on such a line, which just means that end of file was reached before a newline; using its return value directly as the condition of the classic while loop therefore skips that one last line. If you change the loop logic slightly, read can process files that lack a trailing newline correctly, with no need for prior sanitisation:

while read -r || [[ -n "$REPLY" ]]; do
    # your processing of $REPLY here
done < "/path/to/file"

Note that this is much faster than solutions relying on external commands, since read is a shell builtin and the loop starts no extra processes.
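
The difference between the two loop forms can be checked with a small self-contained test. This is only a sketch: the temporary file, the variable names (demo_file, classic_count, fixed_count, last_seen) and the use of a named variable line instead of $REPLY are assumptions made for illustration, not part of the answer above.

```shell
# Hypothetical demo file: three lines, the last one without a trailing newline.
demo_file=$(mktemp)
printf 'one\ntwo\nthree' > "$demo_file"

# Classic loop: read returns non-zero on the unterminated line,
# so the loop body never runs for "three".
classic_count=0
while IFS= read -r line; do
    classic_count=$((classic_count + 1))
done < "$demo_file"

# Adjusted loop: keep going when read failed but still filled the variable.
fixed_count=0
last_seen=
while IFS= read -r line || [ -n "$line" ]; do
    fixed_count=$((fixed_count + 1))
    last_seen=$line
done < "$demo_file"

echo "classic loop saw $classic_count lines, adjusted loop saw $fixed_count"
rm -f "$demo_file"
```

With a named variable, IFS= keeps leading and trailing whitespace intact, which read does on its own only when it fills $REPLY.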

Hat tip to Gordon Davisson for improving the loop logic.

kopischke answered Oct 23 '22 04:10


POSIX defines a line as a sequence of characters terminated by a newline character, so text without a final newline is, strictly speaking, not a complete line. This site offers a solution to exactly the scenario you are describing; the final product is this snippet:

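# A variable holding a single newline character.
newline='
'
# Command substitution strips trailing newlines, so append an x to
# protect the file's own final newline, then remove the x again.
lastline=$(tail -n 1 file; echo x); lastline=${lastline%x}
# Compare the last character of the last line with a newline; append one if it differs.
[ "${lastline#"${lastline%?}"}" != "$newline" ] && echo >> file
# Now file is sane; do our normal processing here...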
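
A shorter variant of the same idea looks only at the last byte of the file. This is a sketch rather than the method from the linked site: ensure_trailing_newline is a hypothetical helper name, and the demo file is made up for illustration. It relies on command substitution stripping a trailing newline, so the test string is non-empty only when the last byte is something else.

```shell
# ensure_trailing_newline is a hypothetical helper, not part of the answer above.
ensure_trailing_newline() {
    # $(...) strips a trailing newline, so the result is non-empty only
    # when the file has a last byte and that byte is not a newline.
    if [ -n "$(tail -c 1 "$1")" ]; then
        printf '\n' >> "$1"
    fi
}

# Demo on a temporary file whose last line is unterminated.
demo_file=$(mktemp)
printf 'foo\nbar' > "$demo_file"
ensure_trailing_newline "$demo_file"
ensure_trailing_newline "$demo_file"   # second call changes nothing

lines_after=$(( $(wc -l < "$demo_file") ))
bytes_after=$(( $(wc -c < "$demo_file") ))
echo "lines: $lines_after, bytes: $bytes_after"
rm -f "$demo_file"
```

Unlike the snippet above, this variant leaves an empty file untouched, and like that snippet it modifies the file in place, so it only suits files you are allowed to change.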

Giacomo1968 answered Oct 23 '22 04:10


If you must use read, try this:

awk '{ print $0 }' foo | while IFS= read -r line; do
    echo "the line is $line"
done

This works because awk recognizes the last line even without the newline character, and print ends every record it outputs with a newline.
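
The effect of the awk stage can be seen by counting newline characters before and after it. This is a minimal sketch; the two-line sample input and the variable names raw_count and awk_count are assumptions chosen for illustration.

```shell
# The sample input is "a", a newline, then "b" with no final newline.
# wc -l counts newline characters, so it reports one line for the raw input.
raw_count=$(( $(printf 'a\nb' | wc -l) ))

# After passing through awk, every record ends with a newline.
awk_count=$(( $(printf 'a\nb' | awk '{ print $0 }' | wc -l) ))

echo "newlines before awk: $raw_count, after awk: $awk_count"
```

Because awk runs as a separate process and the loop sits on the receiving end of a pipe, variables set inside that loop are typically not visible after it finishes.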

EJK answered Oct 23 '22 02:10