Awk/sed replace newlines

Q: How do you use a new line character in sed?

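By convention, every line of a text file ends with \n. `sed` reads its input one line at a time and strips the trailing newline before running its commands, so a plain s/\n/.../ never matches on a single line. To replace newlines, the lines must first be joined in the pattern space (for example with the N command). GNU sed additionally offers the -z option, which uses the NUL character instead of \n as the line delimiter. When the \n is missing from the last line of the file, GNU sed does not add one on output.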
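
As a minimal sketch of replacing newlines with sed: sed strips the newline from each line it reads, so the lines are first collected in the pattern space with N, after which s/\n/,/g can see the embedded newlines. The function name join_with_commas is made up for illustration, and the whole input is held in memory, so this suits small inputs only.

```shell
# join_with_commas: read stdin and replace every newline between lines
# with a comma (hypothetical helper, for illustration only)
join_with_commas() {
    # :a      define label "a"
    # N       append the next input line to the pattern space
    # $!ba    if this is not the last line, branch back to "a"
    # s///g   all lines are now in the pattern space; replace the newlines
    sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g'
}

printf 'a\nb\nc\n' | join_with_commas
# prints: a,b,c
```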

Intro:

I have been given a CSV file in which the field delimiter is the pipe character (i.e., |). This file has a pre-defined number of fields (say N). I can discover the value of N by reading the header of the CSV file, which we can assume to be correct.

Problem:

Some of the fields contain a newline character by mistake, which makes the line appear shorter than required (i.e., it has M fields, with M < N).

What I need to create is a sh script (not bash) to fix those lines.

Attempted solution:

I tried creating the following script to try fixing the file:

if [ $# -ne 1 ]
then
    echo "Usage: $0 <filename>"
    exit
fi

# get first line
first_line=$(head -n 1 $1)

# get number of fields
num_separators=$(echo "$first_line" | tr -d -c '|' | awk '{print length}')

cat $1  | awk -v numFields=$(( num_separators + 1 )) -F '|' '
{
    totRecords = NF/numFields
    # loop over lines
    for (record=0; record < totRecords; record++) {
        output = ""
        # loop over fields
        for (i=0; i<numFields; i++) {
            j = (numFields*record)+i+1 
            # replace newline with question mark
            sub("\n", "?", $j)
            output = output (i > 0 ? "|" : "") $j 
        }
        print output
    }
}
'

However, the newline character is still present. How can I fix that problem?

Example of the CSV:

FIRST_NAME|LAST_NAME|NOTES
John|Smith|This is a field with a
newline
Foo|Bar|Baz

Expected output:

FIRST_NAME|LAST_NAME|NOTES
John|Smith|This is a field with a * newline
Foo|Bar|Baz

* I don't care what the replacement is: it could be a space, a question mark, or anything else except a newline or a pipe (which would create a new field).
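
To make the problem concrete, here is a small sketch that reads N from the header and reports every line that has fewer fields. It also hints at why the attempt above cannot work: awk splits its input into records at newlines by default, so no field ever contains \n and sub("\n", ...) has nothing to match. The helper name report_short_lines and the use of mktemp for the demo file are assumptions, not part of the question.

```shell
# report_short_lines FILE: print the line number and field count of every
# line with fewer pipe-separated fields than the header (hypothetical helper)
report_short_lines() {
    awk -F'|' '
        NR == 1 { n = NF; next }    # N is taken from the header line
        NF < n  { printf "line %d: %d of %d fields\n", NR, NF, n }
    ' "$1"
}

# Demo on the sample from the question, written to a temporary file
sample=$(mktemp)
printf '%s\n' 'FIRST_NAME|LAST_NAME|NOTES' \
              'John|Smith|This is a field with a' \
              'newline' \
              'Foo|Bar|Baz' > "$sample"
report_short_lines "$sample"
# prints: line 3: 1 of 3 fields
rm -f "$sample"
```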

asked Jun 27 '16 16:06

user2340612

2 Answers

$ cat tst.awk
BEGIN { FS=OFS="|" }
NR==1 { reqdNF = NF; printf "%s", $0; next }
{ printf "%s%s", (NF < reqdNF ? " " : ORS), $0 }
END { print "" }

$ awk -f tst.awk file.csv
FIRST_NAME|LAST_NAME|NOTES
John|Smith|This is a field with a newline
Foo|Bar|Baz

If that's not what you want then edit your question to provide more truly representative sample input and associated output.
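
Since the question asks for an sh script, one possible way to wrap the same awk logic in a POSIX shell function is sketched below. The function name join_broken_lines and the optional separator argument are assumptions added for illustration. Note the limitation inherited from the approach: a line is glued to the previous one whenever it has fewer fields than the header, so this is only reliable when the stray newline sits in the last field (the first part of the record then already has the full field count).

```shell
# join_broken_lines FILE [SEP]: append every line that has fewer fields than
# the header to the previous line, writing SEP (default " * ") where the
# newline used to be. Hypothetical helper, for illustration only.
join_broken_lines() {
    awk -v sep="${2:- * }" '
        BEGIN   { FS = "|" }
        NR == 1 { reqdNF = NF; printf "%s", $0; next }   # header defines N
                { printf "%s%s", (NF < reqdNF ? sep : ORS), $0 }
        END     { print "" }                             # final newline
    ' "$1"
}

# Demo on the sample from the question
sample=$(mktemp)
printf '%s\n' 'FIRST_NAME|LAST_NAME|NOTES' \
              'John|Smith|This is a field with a' \
              'newline' \
              'Foo|Bar|Baz' > "$sample"
join_broken_lines "$sample"
rm -f "$sample"
```

Calling join_broken_lines file.csv ' ' would reproduce the space-separated output shown above.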

answered Sep 23 '22 13:09

Ed Morton

This is based on the assumption that only the last field may contain a newline, and at most one. Using tac and sed:

tac file.csv | sed -n '/|/!{h;n;x;H;x;s/\n/ * /p;b};p' | tac

Output:

FIRST_NAME|LAST_NAME|NOTES
John|Smith|This is a field with a * newline
Foo|Bar|Baz

How it works: the file is read backwards with tac, because sed is easier to use without forward references. If a line has no '|' separator (/|/!), sed runs the block of commands in curly braces {}; otherwise it just prints the line with p. The block of commands:

h; stores the delimiter-less line in sed's hold buffer.
n; fetches another line. Since we're reading backwards, this is the line that should be appended to.
x; exchanges the hold buffer and the pattern buffer.
H; appends the pattern buffer to the hold buffer, separated by a newline.
x; exchanges the buffers again, so the pattern buffer now holds both lines.
s/\n/ * /p; replaces the linefeed in the middle with " * ", so there's now only one longer line, and prints it.
b; branches to the end of the script, skipping the final p, and starts over with the next line.

Re-reverse the file with tac; done.
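
The steps above can also be written as a commented multi-line sed script, which may be easier to follow than the one-liner. This is a sketch that assumes tac (GNU coreutils) is available; the function name join_last_field is made up for illustration.

```shell
# join_last_field FILE: same logic as the one-liner, one sed command per line
# (hypothetical helper; relies on tac from GNU coreutils)
join_last_field() {
    tac "$1" | sed -n '
        # lines without a "|" are continuation lines
        /|/!{
            # save the continuation line in the hold buffer
            h
            # fetch the line it belongs to (the next one, read backwards)
            n
            # swap: pattern = continuation, hold = record start
            x
            # hold = record start + newline + continuation
            H
            # bring the joined pair back into the pattern buffer
            x
            # replace the embedded newline and print the result
            s/\n/ * /p
            # skip the final p and go on with the next line
            b
        }
        # ordinary lines are printed unchanged
        p
    ' | tac
}

# Demo on the sample from the question
sample=$(mktemp)
printf '%s\n' 'FIRST_NAME|LAST_NAME|NOTES' \
              'John|Smith|This is a field with a' \
              'newline' \
              'Foo|Bar|Baz' > "$sample"
join_last_field "$sample"
rm -f "$sample"
```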

answered Sep 21 '22 13:09

agc

Tags: shell, replace, csv, awk