Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Read line by line and print matches line by line

I am new to shell scripting, it would be great if I can get some help with the question below.

I want to read a text file line by line, and print all matched patterns in that line to a line in a new text file.

For example:

$ cat input.txt

SYSTEM ERROR: EU-1C0A  Report error -- SYSTEM ERROR: TM-0401 DEFAULT Test error
SYSTEM ERROR: MG-7688 DEFAULT error -- SYSTEM ERROR: DN-0A00 Error while getting object -- ERROR: DN-0A52 DEFAULT Error -- ERROR: MG-3218 error occured in HSSL
SYSTEM ERROR: DN-0A00 Error while getting object -- ERROR: DN-0A52 DEFAULT Error
SYSTEM ERROR: EU-1C0A  error Failed to fill in test report -- ERROR: MG-7688

The intended output is as follows:

$ cat output.txt

EU-1C0A TM-0401
MG-7688 DN-0A00 DN-0A52 MG-3218
DN-0A00 DN-0A52
EU-1C0A MG-7688

I tried the following code:

while read p; do
    grep -o '[A-Z]\{2\}-[A-Z0-9]\{4\}' | xargs
done < input.txt > output.txt

which produced this output:

EU-1C0A TM-0401 MG-7688 DN-0A00 DN-0A52 MG-3218 DN-0A00 DN-0A52 EU-1C0A MG-7688 .......

Then I also tried this:

while read p; do
    grep -o '[A-Z]\{2\}-[A-Z0-9]\{4\}' | xargs > output.txt
done < input.txt

But did not help :(

Maybe there is another way, I am open to awk/sed/cut or whatever... :)

Note: There can be any number of Error codes (i.e. XX:XXXX, the pattern of interest in a single line).

like image 360
Dinesh Kumar Avatar asked Dec 09 '16 19:12

Dinesh Kumar


People also ask

How read file line by line in shell script and store each line in a variable?

We use the read command with -r argument to read the contents without escaping the backslash character. We read the content of each line and store that in the variable line and inside the while loop we echo with a formatted -e argument to use special characters like \n and print the contents of the line variable.

How do I read a file line by line?

Method 1: Read a File Line by Line using readlines() readlines() is used to read all the lines at a single go and then return them as each line a string element in a list. This function can be used for small files, as it reads the whole file content to the memory, then split it into separate lines.

What is IFS readline?

The read command reads the file line by line, assigning each line to the $line bash shell variable. Once all lines are read from the file the bash while loop will stop. The internal field separator (IFS) is set to the empty string to preserve whitespace issues. This is a fail-safe feature.

How do you grep a line after a match?

Using the grep Command. If we use the option '-A1', grep will output the matched line and the line after it.


1 Answers

% awk 'BEGIN{RS=": "};NR>1{printf "%s%s", $1, ($0~/\n/)?"\n":" "}' input.txt 
EU-1C0A TM-0401
MG-7688 DN-0A00 DN-0A52 MG-3218
DN-0A00 DN-0A52
EU-1C0A MG-7688

Explanation in longform:

awk '
    BEGIN{ RS=": " } # Set the record separator to colon-space
    NR>1 {           # Ignore the first record
        printf("%s%s", # Print two strings:
            $1,      # 1. first field of the record (`$1`)
            ($0~/\n/) ? "\n" : " ")
                     # Ternary expression, read as `if condition (thing
                     # between brackets), then thing after `?`, otherwise
                     # thing after `:`.
                     # So: If the record ($0) matches (`~`) newline (`\n`),
                     # then put a newline. Otherwise, put a space.
    }
' input.txt 

Previous answer to the unmodified question:

% awk 'BEGIN{RS=": "};NR>1{printf "%s%s", $1, (NR%2==1)?"\n":" "}' input.txt 
EU-1C0A TM-0401
MG-7688 MG-3218
DN-0A00 DN-0A52
EU-1C0A MG-7688

edit: With safeguard against :-injection (thx @e0k). Tests that the first field after the record seperator looks like how we expect it to be.

awk 'BEGIN{RS=": "};NR>1 && $1 ~ /^[A-Z]{2}-[A-Z0-9]{4}$/ {printf "%s%s", $1, ($0~/\n/)?"\n":" "}' input.txt
like image 133
joepd Avatar answered Oct 15 '22 02:10

joepd