Parsing pipe delimited input in awk

Tags: parsing, pipe, awk, delimited

I have seen many posts asking a similar question, but I can't get it working.

Input looks like:

<field one with spaces>|<field two with spaces>

Trying to parse with awk.

Have tried many variants from excellent posts:

FS = "^[\x00- ]*|[\x00- ]*[|][\x00- ]*|[\x00- ]*$";
FS = "^[\x00- ]*|[\x00- ]*\|[\x00- ]*|[\x00- ]*$";
FS = "^[\x00- ]*|[\x00- ]*\\|[\x00- ]*|[\x00- ]*$";

Still can't get the pipe delimiter to work.

Using CentOS.

Any help?

379

asked Aug 02 '11 19:08

scorpdaddy

1 Answer

 echo "field one has spaces | field two has spaces" \
 | awk '
 BEGIN {
     FS = "|"
 }
 {
     print $2
     print $1
     # or whatever you want
 }'

 #output

  field two has spaces
  field one has spaces

You can also reduce this to

awk -F'|' '{
    print $2
    print $1
}'

Edit: Also, not all awks can take a multi-character regex for the FS value.
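For an awk that only accepts a single-character FS, one possible workaround is to split on the bare pipe and trim each field afterwards. This is only a minimal sketch: `trim_fields` is a hypothetical helper name, and it assumes the padding to remove is just spaces and tabs.

```shell
# Hypothetical helper: split on a literal pipe, then trim spaces and tabs
# from both ends of every field before printing it in brackets.
trim_fields() {
  awk -F'|' '{
    for (i = 1; i <= NF; i++) {
      gsub(/^[ \t]+|[ \t]+$/, "", $i)   # strip leading and trailing blanks
      printf "[%s]\n", $i
    }
  }'
}

echo "  field one has spaces | field two has spaces  " | trim_fields
```

The brackets are only there to make it visible that the padding around each field is gone.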

Edit 2: Somehow I missed this originally, but I see you are trying to include \x00 in the character classes before and after the | char. I assume you mean \x00 to be the null char? I don't think you're going to be able to have awk parse a file with embedded null chars. You could preprocess your input like

 tr '\000' ' ' < file.txt > spacesForNulls.txt

OR delete them altogether with

tr -d '\000' < file.txt > deletedNulls.txt

and eliminate that part of your regex. But as above, some awks don't support a regex for the FS value. Also, I don't use the tr trick very much; you may find that it requires a slightly different notation for the null char, depending on your version of tr (the octal form \000 is the one POSIX specifies).
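Put together, the preprocessing and the split can run as one pipeline without a temporary file. A rough sketch, assuming the null chars carry no meaning and can simply be dropped; `strip_nuls` is a hypothetical helper name and the sample line is made up for illustration.

```shell
# Hypothetical helper: delete NUL bytes (octal \000), then split on the pipe
# and print the second field followed by the first.
strip_nuls() {
  tr -d '\000' | awk -F'|' '{ print $2; print $1 }'
}

# Made-up sample line with a NUL byte embedded in each field.
printf 'field\000 one|field\000 two\n' | strip_nuls
```

With that sample input this should print "field two" and then "field one".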

I hope this helps.

156

answered Sep 28 '22 01:09

shellter
