
Remove \r\n in awk

Tags: linux, awk

I have a simple awk command that converts a date from MM/DD/YYYY to YYYY/MM/DD. However, the file I'm using has \r\n at the end of the lines, and sometimes the date is at the end of the line.

awk '
  BEGIN { FS = OFS = "|" }
  {
    split($27, date, /\//)
    $27 = date[3] "/" date[1] "/" date[2]

    print $0
  }
' file.txt

In this case, if the date is MM/DD/YYYY\r\n then I end up with this in the output:

YYYY
/MM/DD

What is the best way to get around this? Keep in mind, sometimes the input is simply \r\n in which case the output SHOULD be // but instead ends up as

/
/
richie asked Apr 09 '17 17:04



1 Answer

Given that the \r isn't always at the end of field $27, the simplest approach is to remove the \r from the entire line.

With GNU Awk or Mawk (one of which is typically the default awk on Linux platforms), you can simply define your input record separator, RS, accordingly:

awk -v RS='\r\n' ...

Or, if you want \r\n-terminated output lines too, set the output record separator, ORS, to the same value:

awk 'BEGIN { RS=ORS="\r\n"; ... 
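
Combined with the script from the question, the RS approach might look like the sketch below. To keep the sample short it assumes the date sits in field 3 rather than field 27; the function name convert_dates and the sample input are made up for illustration, and an awk that accepts a multi-character RS (GNU Awk or Mawk) is required.

```shell
# Hypothetical helper: reads CRLF-terminated, pipe-delimited lines on stdin
# and rewrites field 3 (field 27 in the question) from MM/DD/YYYY to YYYY/MM/DD.
# Assumes an awk that accepts a multi-character RS (GNU Awk, Mawk).
convert_dates() {
  awk '
    BEGIN { FS = OFS = "|"; RS = "\r\n" }   # \r\n is consumed as the record separator
    {
      split($3, date, "/")                  # an empty field leaves the date array empty
      $3 = date[3] "/" date[1] "/" date[2]
      print                                 # output lines end in a plain \n
    }
  '
}

# Sample input (made up): one line with a date, one with an empty date field.
printf 'a|b|04/09/2017\r\nc|d|\r\n' | convert_dates
```

With the sample input this should print a|b|2017/04/09 followed by c|d|//, each on a single line, because the \r never reaches the last field.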

Optional reading: an aside for BSD/macOS Awk users:

BSD/macOS awk doesn't support multi-character RS values (in line with the POSIX Awk spec: "If RS contains more than one character, the results are unspecified").

Therefore, a sub call inside the Awk script is necessary to trim the \r instance from the end of each input line:

awk '{ sub("\r$", ""); ... 

To also output \r\n-terminated lines, option -v ORS='\r\n' (or ORS="\r\n" inside the script's BEGIN block) will work fine, as with GNU Awk and Mawk.
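
For awk implementations without multi-character RS support, the sub approach can be sketched as follows. As an illustration only, it again assumes the date is in field 3 rather than field 27; the function name convert_dates_portable and the sample input are hypothetical. ORS is set to \r\n here to show the CRLF-preserving variant; drop that assignment for plain \n output.

```shell
# Hypothetical helper: same date conversion without a multi-character RS.
# Field 3 stands in for field 27 from the question.
convert_dates_portable() {
  awk '
    BEGIN { FS = OFS = "|"; ORS = "\r\n" }  # ORS keeps CRLF line endings on output
    {
      sub(/\r$/, "")                        # strip the trailing \r; awk re-splits the fields
      split($3, date, "/")
      $3 = date[3] "/" date[1] "/" date[2]
      print
    }
  '
}

# od -c makes the \r \n line endings visible in the result.
printf 'a|b|04/09/2017\r\nc|d|\r\n' | convert_dates_portable | od -c
```

Because sub operates on the whole record ($0) by default, the \r is removed regardless of which field happens to be last on the line.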

mklement0 answered Oct 20 '22 00:10