Removing Windows newlines on Linux (sed vs. awk)

Tags: linux, sed, awk

I have some delimited files with stray newline sequences in the middle of fields (not at line ends), which appear as ^M in Vim. They come from freebcp (on CentOS 6) exports of an MSSQL database. A hex dump of the data shows \r\n patterns:

$ xxd test.txt | grep 0d0a
0000190: 3932 3139 322d 3239 3836 0d0a 0d0a 7c43

I can remove them with awk, but am unable to do the same with sed.

This works in awk, removing the line breaks completely:

awk 'gsub(/\r/,""){printf $0;next}{print}'

But this sed command does not; it leaves the line feeds in place:

sed -i 's/\r//g'

and this one appears to have no effect at all:

sed -i 's/\r\n//g'

Using ^M in the sed expression (ctrl+v, ctrl+m) also does not seem to work.

For this sort of task, sed is easier to grok, but I am working on learning more about both. Am I using sed improperly, or is there a limitation?


asked Jul 27 '12 02:07

kermatt

4 Answers

You can use the command-line tool dos2unix:

dos2unix input

Or use the tr command:

tr -d '\r' <input >output

Actually, you can do the file-format switching in vim:

Method A:

:e ++ff=dos
:w ++ff=unix
:e!

Method B:

:e ++ff=dos
:set ff=unix
:w

EDIT

If you want to delete the \r\n sequences in the file, try these commands in vim:

:e ++ff=unix           " <-- make sure open with UNIX format
:%s/\r\n//g            " <-- remove all \r\n
:w                     " <-- save file

Your awk solution works fine. Another two sed solutions:

sed '1h;1!H;$!d;${g;s/\r\n//g}' input
sed ':A;/\r$/{N;bA};s/\r\n//g' input
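As for why the original sed attempts behave the way they do: sed reads its input one line at a time and strips the trailing line feed before the script runs, so a per-line substitution never sees \n. The sketch below illustrates this on a made-up sample, assuming GNU sed (which understands \r). The function names and the sample data are hypothetical, chosen only for the demonstration.

```shell
# Hypothetical sample: one field broken by a stray CRLF, then a normal LF line end.
sample() { printf 'abc\r\ndef|ghi\n'; }

# sed removes the trailing LF from each line before matching, so the pattern
# space never contains \r\n and this substitution changes nothing.
no_effect() { sample | sed 's/\r\n//g'; }

# This deletes the CR, but the LF that followed it is still the line terminator.
cr_only() { sample | sed 's/\r//g'; }

# Appending the next line with N puts the LF inside the pattern space,
# where \r\n can finally match (the second sed command from above).
joined() { sample | sed ':A;/\r$/{N;bA};s/\r\n//g'; }

# od -c makes the remaining control characters visible.
for f in no_effect cr_only joined; do
    echo "$f:"
    "$f" | od -c
done
```

Only the last variant joins the two halves of the field into a single line.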


answered Oct 03 '22 07:10

kev

I believe some versions of sed will not recognize \r as a character. However, you can use a bash feature to work around that limitation:

echo "$string" | sed $'s/\r//'

Here, you let bash replace '\r' with the actual carriage return character inside the $'...' construct before passing that to sed as its command. (Assuming you use bash; other shells should have a similar construct.)
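If the shell at hand has no $'...' quoting, a comparable effect can be had by capturing a literal carriage return with printf and interpolating it into the sed script. This is a swapped-in portable variant, not the bash construct itself; a minimal sketch, assuming a POSIX sh, with `cr` and `strip_cr` as hypothetical names:

```shell
# Capture a literal carriage return. Command substitution strips only
# trailing newlines, so the CR survives in the variable.
cr=$(printf '\r')

# sed receives the actual CR byte, so it does not need to understand \r itself.
strip_cr() { sed "s/${cr}//g"; }

# Usage: the CR before the line end is removed, the LF is kept.
printf 'abc\r\n' | strip_cr | od -c
```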

answered Oct 05 '22 07:10

chepner

sed -e 's/\r//g' input_file

This works for me. The difference is that I use -e instead of -i.

I have also noticed that sed behaves differently on different platforms. Mine reports:

$ sed --version
This is not GNU sed version 4.0

answered Oct 04 '22 07:10

Sergiy Dolnyy

Another method:

awk 1 RS='\r\n' ORS=

RS='\r\n' sets the record separator to \r\n.
ORS= sets the output record separator to the empty string.
1 is a pattern that is always true; with no action block given, the default action {print} is used.
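A quick way to check this on a small sample is sketched below. It assumes an awk that accepts a multi-character RS (gawk and mawk treat it as a regular expression; POSIX leaves the behavior unspecified). The wrapper name `strip_crlf` and the sample data are hypothetical.

```shell
# Wrap the one-liner so it can be used as a filter.
strip_crlf() { awk 1 RS='\r\n' ORS=; }

# Hypothetical sample: a field split by a stray CRLF, then a normal LF line end.
# The CRLF is consumed as the record separator and nothing is printed in its
# place; the final LF is part of the last record, so it is preserved.
printf 'abc\r\ndef|ghi\n' | strip_crlf | od -c
```

Since every \r\n is removed, this suits files like the one in the question, where genuine line ends are plain LF and only the stray breaks are CRLF.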

answered Oct 01 '22 07:10

Zombo
