I'm using <code>awk '{gsub(/^[ \t]+|[ \t]+$/,""); print;}' in.txt > out.txt</code> to remove both leading and trailing whitespace. The problem is that the output file still has trailing whitespace! All lines have the same length: they are right-padded with spaces. What am I missing?

UPDATE 1: The problem is probably due to the fact that the trailing spaces are not "normal" spaces but \x20 characters (DC4).

UPDATE 2: I used <code>gsub (/'[[:cntrl:]]|[[:space:]]|\x20/,"")</code> and it worked. Two strange things: <ol> <li>Why isn't \x20 considered a control character?</li> <li>Using <code>'[[:cntrl:][:space:]\x20</code> does NOT work. Why?</li> </ol>


How to remove leading and trailing whitespaces?

2 Answers

This command works for me:

$ awk '{$1=$1}1' file.txt
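One caveat with this idiom, shown as a small sketch (the helper names and sample strings are made up for illustration): assigning `$1=$1` makes awk rebuild the record from its fields joined by the output field separator. With the default separators, this trims both ends but also collapses every internal run of spaces and tabs into a single space. The `gsub` approach from the question only touches the ends.

```shell
# Hypothetical helper names, for illustration only.
# trim_rebuild: the $1=$1 idiom; trims the ends AND squeezes inner whitespace.
trim_rebuild() { awk '{$1=$1}1'; }
# trim_ends: the gsub approach; touches only the ends of each line.
trim_ends() { awk '{gsub(/^[ \t]+|[ \t]+$/,""); print}'; }

printf '  a   b  \n' | trim_rebuild   # prints: a b
printf '  a   b  \n' | trim_ends      # prints: a   b
```

If internal spacing matters (for example in fixed-width data), the second form is probably the safer choice.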

193

answered Sep 17 '22 14:09

kev

Your code works for me.
Your file may contain characters other than spaces and tabs.
hexdump -C can help you check what they are:

awk '{gsub(/^[ \t]+|[ \t]+$/,""); print;}' in.txt | hexdump -C | less
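If hexdump is not at hand, a rough alternative is sed's `l` command, which writes non-printable bytes as octal escapes and marks the end of each line with `$`. The sketch below uses a hypothetical helper name and a made-up sample line that ends with DC4 (octal 024) followed by a space; note that long lines may be folded by some sed implementations.

```shell
# show_ends (hypothetical name): print each line unambiguously, so that
# trailing spaces and control characters become visible.
show_ends() { sed -n 'l'; }

printf 'TEXT\024 \n' | show_ends    # prints: TEXT\024 $
```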

UPDATE:

OK, you identified DC4 (there may be other control characters as well).
You can then extend the character classes in your command:

awk '{gsub(/^[[:cntrl:][:space:]]+|[[:cntrl:][:space:]]+$/,""); print;}' in.txt > out.txt

See the awk man page:

[:alnum:] Alphanumeric characters.
[:alpha:] Alphabetic characters.
[:blank:] Space or tab characters.
[:cntrl:] Control characters.
[:digit:] Numeric characters.
[:graph:] Characters that are both printable and visible. (A space is printable, but not visible, while an a is both.)
[:lower:] Lower-case alphabetic characters.
[:print:] Printable characters (characters that are not control characters.)
[:punct:] Punctuation characters (characters that are not letters, digits, control characters, or space characters).
[:space:] Space characters (such as space, tab, and formfeed, to name a few).
[:upper:] Upper-case alphabetic characters.
[:xdigit:] Characters that are hexadecimal digits.
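To see how these classes treat the characters discussed here, the following sketch reports which of three classes a single character belongs to. The `classify` helper is hypothetical, and the example assumes an awk with POSIX character class support (gawk, BSD awk, or a recent mawk).

```shell
# classify (hypothetical helper): for each input line holding exactly one
# character, print the POSIX classes (out of cntrl/space/blank) it matches.
classify() {
  awk '{
    out = ""
    if ($0 ~ /^[[:cntrl:]]$/) out = out "cntrl "
    if ($0 ~ /^[[:space:]]$/) out = out "space "
    if ($0 ~ /^[[:blank:]]$/) out = out "blank "
    sub(/ $/, "", out)                 # drop the trailing separator
    print (out == "" ? "none" : out)
  }'
}

printf '\024\n' | classify   # DC4 (octal 024, hex 14): cntrl
printf ' \n'    | classify   # space (hex 20):          space blank
printf '\t\n'   | classify   # tab (hex 09):            cntrl space blank
```

This also bears on the first question: 0x20 is the ordinary space, which belongs to [:space:] and not to [:cntrl:], whereas DC4 is 0x14 (decimal 20).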

Leading/Trailing `0x20` removal

The command works for me. I tested it like this:

$ echo -e "\x20 \tTEXT\x20 \t" | hexdump -C
00000000  20 20 09 54 45 58 54 20  20 09 0a                 |  .TEXT  ..|
0000000b
$ echo -e "\x20 \tTEXT\x20 \t" | awk '{gsub(/^[[:cntrl:][:space:]]+|[[:cntrl:][:space:]]+$/,""); print;}' | hexdump -C
00000000  54 45 58 54 0a                                    |TEXT.|
00000005

However, a 0x20 in the middle of your text is not removed, because the pattern is anchored to the start and end of the line.
But that is not what you asked, is it?
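Should stray control characters ever appear in the middle of a line as well, one possible variation is sketched below. The helper name and the sample input are made up. Be aware that tab is itself a control character, so this version also deletes tabs inside the text.

```shell
# strip_cntrl_everywhere (hypothetical name): delete control characters
# anywhere in each line, then trim leftover whitespace at both ends.
strip_cntrl_everywhere() {
  awk '{gsub(/[[:cntrl:]]/,""); gsub(/^[[:space:]]+|[[:space:]]+$/,""); print}'
}

printf ' A\024B \024\n' | strip_cntrl_everywhere   # prints: AB
```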

answered Sep 16 '22 14:09

oHo


Tags: whitespace, removing-whitespace, awk, gsub

user1194552
