I am new to shell scripting, and it would be great if I could get some help with the question below.
I want to read a text file line by line, and print all matched patterns in that line to a line in a new text file.
For example:
$ cat input.txt
SYSTEM ERROR: EU-1C0A Report error -- SYSTEM ERROR: TM-0401 DEFAULT Test error
SYSTEM ERROR: MG-7688 DEFAULT error -- SYSTEM ERROR: DN-0A00 Error while getting object -- ERROR: DN-0A52 DEFAULT Error -- ERROR: MG-3218 error occured in HSSL
SYSTEM ERROR: DN-0A00 Error while getting object -- ERROR: DN-0A52 DEFAULT Error
SYSTEM ERROR: EU-1C0A error Failed to fill in test report -- ERROR: MG-7688
The intended output is as follows:
$ cat output.txt
EU-1C0A TM-0401
MG-7688 DN-0A00 DN-0A52 MG-3218
DN-0A00 DN-0A52
EU-1C0A MG-7688
I tried the following code:
while read p; do
grep -o '[A-Z]\{2\}-[A-Z0-9]\{4\}' | xargs
done < input.txt > output.txt
which produced this output:
EU-1C0A TM-0401 MG-7688 DN-0A00 DN-0A52 MG-3218 DN-0A00 DN-0A52 EU-1C0A MG-7688 .......
Then I also tried this:
while read p; do
grep -o '[A-Z]\{2\}-[A-Z0-9]\{4\}' | xargs > output.txt
done < input.txt
But that did not help :(
Maybe there is another way, I am open to awk/sed/cut or whatever... :)
Note: There can be any number of error codes (i.e. XX-XXXX, the pattern of interest) in a single line.
We use the read command with the -r option so that backslashes in the input are read literally instead of being treated as escape characters. Each line is stored in the variable line, and inside the while loop we print it with echo -e, which interprets escape sequences such as \n.
The read command reads the file line by line, assigning each line to the $line shell variable. Once all lines have been read from the file, the while loop stops. The internal field separator (IFS) is set to the empty string so that leading and trailing whitespace in each line is preserved. This is a safeguard.
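Applied to the loop from the question, this could look like the sketch below. The attempts above never hand $p to grep, so grep reads the rest of the loop's standard input by itself and everything ends up on one line. The function name extract_codes is made up for illustration, and the sketch assumes a grep that supports -o (GNU and BSD grep do) and an xargs that joins its input with the default echo.

```shell
# Hypothetical helper: for every input line, print the error codes
# found on that line, space-separated, on one output line.
extract_codes() {
    # IFS= keeps leading/trailing whitespace, -r keeps backslashes literal.
    # The extra test handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Feed only the current line to grep; without explicit input,
        # grep would consume the rest of the loop's stdin itself.
        printf '%s\n' "$line" |
            grep -o '[A-Z]\{2\}-[A-Z0-9]\{4\}' |
            xargs   # join the matches of this line with single spaces
    done
}
```

It would be called as extract_codes < input.txt > output.txt. Note that the redirection to output.txt sits outside the loop, so the file is written once instead of being truncated on every iteration.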
Using the grep command: with the option -A1, grep outputs each matched line and the one line after it.
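A small illustration of -A1, with made-up sample text. The -A option is available in GNU and BSD grep but is not part of POSIX, so this is a sketch under that assumption.

```shell
# Sample input invented for illustration only.
sample='first line
ERROR: DN-0A00 failed
context after the match
last line'

# -A1 prints each matching line plus one line of trailing context.
matched=$(printf '%s\n' "$sample" | grep -A1 'ERROR')
printf '%s\n' "$matched"
```

This prints the ERROR line and the line that follows it, which is useful for context but does not by itself group the matches per line.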
% awk 'BEGIN{RS=": "};NR>1{printf "%s%s", $1, ($0~/\n/)?"\n":" "}' input.txt
EU-1C0A TM-0401
MG-7688 DN-0A00 DN-0A52 MG-3218
DN-0A00 DN-0A52
EU-1C0A MG-7688
Explanation in longform:
awk '
BEGIN { RS=": " }            # Set the record separator to colon-space
NR>1 {                       # Ignore the first record
    printf("%s%s",           # Print two strings:
        $1,                  # 1. first field of the record (`$1`)
        ($0~/\n/) ? "\n" : " ")
        # 2. Ternary expression, read as: if the condition (the thing
        # between brackets) holds, then the thing after `?`, otherwise
        # the thing after `:`.
        # So: if the record ($0) matches (`~`) a newline (`\n`),
        # then put a newline. Otherwise, put a space.
}
' input.txt
Previous answer to the unmodified question:
% awk 'BEGIN{RS=": "};NR>1{printf "%s%s", $1, (NR%2==1)?"\n":" "}' input.txt
EU-1C0A TM-0401
MG-7688 MG-3218
DN-0A00 DN-0A52
EU-1C0A MG-7688
Edit: With a safeguard against ": "-injection (thx @e0k). It tests that the first field after the record separator looks the way we expect an error code to look.
awk 'BEGIN{RS=": "};NR>1 && $1 ~ /^[A-Z]{2}-[A-Z0-9]{4}$/ {printf "%s%s", $1, ($0~/\n/)?"\n":" "}' input.txt
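A different approach, not the one used in the answer above, is to leave the record separator alone and pull the codes out of each line with awk's match() function. The sketch below does not depend on ": " preceding a code, and it avoids a multi-character RS and the {2} interval syntax, which not every awk supports. The function name extract_codes_awk is made up for illustration.

```shell
# Hypothetical helper: print all XX-XXXX codes of each input line,
# space-separated, one output line per input line.
extract_codes_awk() {
    awk '{
        out = ""
        rest = $0
        # match() sets RSTART and RLENGTH for the leftmost match in rest.
        while (match(rest, /[A-Z][A-Z]-[A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9]/)) {
            code = substr(rest, RSTART, RLENGTH)
            out = (out == "") ? code : (out " " code)
            # Continue searching after the end of this match.
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out   # an empty line if the input line had no codes
    }' "$@"
}
```

It can be called as extract_codes_awk input.txt > output.txt, or fed through a pipe. Unlike the RS-based version, a line without any code produces an empty output line, so the line numbers of input and output stay aligned.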