I have a data stream that may contain \r, \n, \r\n, \n\r or any combination of them. Is there a simple way to normalize the data to make all of them simply become \r\n pairs to make display more consistent?
So something that would yield this kind of translation table:
\r --> \r\n
\n --> \r\n
\n\n --> \r\n\r\n
\n\r --> \r\n
\r\n --> \r\n
\r\n\n --> \r\n\r\n
I believe this will do what you need:
using System.Text.RegularExpressions;
// ...
string normalized = Regex.Replace(originalString, @"\r\n|\n\r|\n|\r", "\r\n");
I'm not 100% sure of the exact syntax, and I don't have a .NET compiler handy to check. I wrote it in Perl and converted it into (hopefully correct) C#. The only real trick is to match "\r\n" and "\n\r" first, so that a pair is replaced as one unit rather than as two separate breaks.
To apply it to an entire stream, just run it on chunks of input. (You could do this with a stream wrapper if you want.)
The original Perl:
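One thing to watch when working in chunks: a \r\n pair can be split across two reads, and a plain per-chunk replace would then count it as two breaks. The following is only a sketch of one way around that, not part of the original answer. The class name LineBreakNormalizer and the chunkSize parameter are made up for illustration. It remembers the last break character so that its partner at the start of the next chunk is skipped.

```csharp
using System.IO;

// Hypothetical helper (names are illustrative, not from the answer above).
public static class LineBreakNormalizer
{
    public static void Normalize(TextReader input, TextWriter output, int chunkSize = 4096)
    {
        var buffer = new char[chunkSize];
        char pending = '\0'; // last break char written that may still get a partner
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                char c = buffer[i];
                if (c == '\r' || c == '\n')
                {
                    if (pending != '\0' && pending != c)
                    {
                        // Second half of a \r\n or \n\r pair: the break was already written.
                        pending = '\0';
                    }
                    else
                    {
                        output.Write("\r\n");
                        pending = c;
                    }
                }
                else
                {
                    output.Write(c);
                    pending = '\0';
                }
            }
        }
    }
}
```

Because the state is kept between reads, the result should not depend on where the chunk boundaries fall; it pairs characters the same way the single regex does on the whole string.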
$str =~ s/\r\n|\n\r|\n|\r/\r\n/g;
The test results:
[bash$] ./test.pl
\r -> \r\n
\n -> \r\n
\n\n -> \r\n\r\n
\n\r -> \r\n
\r\n -> \r\n
\r\n\n -> \r\n\r\n
Update: Now converts \n\r to \r\n, though I wouldn't call that normalization.
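For anyone who wants to repeat the table check in C# rather than Perl, something along these lines should do. It is a sketch only; NormalizeDemo, Show and Describe are names invented here, and the pattern is the one from the answer above.

```csharp
using System.Text.RegularExpressions;

public static class NormalizeDemo
{
    // Same pattern as above; the two-character alternatives come first.
    public static string Normalize(string s) =>
        Regex.Replace(s, @"\r\n|\n\r|\n|\r", "\r\n");

    // Makes the control characters visible so the result can be printed.
    public static string Show(string s) =>
        s.Replace("\r", "\\r").Replace("\n", "\\n");

    // Builds one row of the table, e.g. for "\n\r".
    public static string Describe(string input) =>
        Show(input) + " -> " + Show(Normalize(input));
}
```

Calling Console.WriteLine(NormalizeDemo.Describe(s)) for each of the six inputs in the question prints one row per input.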
I'm with Jamie Zawinski on regex:
"Some people, when confronted with a problem, think 'I know, I’ll use regular expressions.' Now they have two problems."
For those of us who prefer readability:
Step 1
Replace \r\n by \n
Replace \n\r by \n (if you really want this, some posters seem to think not)
Replace \r by \n
Step 2
Replace \n by Environment.NewLine or \r\n or whatever.
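The two steps can be wrapped in a small helper so the target break is a parameter. This is a rough sketch; TwoStepNormalizer and the target parameter are hypothetical names, and passing null is assumed to mean "use Environment.NewLine".

```csharp
using System;

public static class TwoStepNormalizer
{
    // Hypothetical helper: a null 'target' falls back to Environment.NewLine.
    public static string Normalize(string source, string target = null)
    {
        // Step 1: collapse every break style to a lone \n.
        string unified = source
            .Replace("\r\n", "\n")
            .Replace("\n\r", "\n")   // optional, drop it if \n\r should count as two breaks
            .Replace("\r", "\n");

        // Step 2: expand each \n to the desired break.
        return unified.Replace("\n", target ?? Environment.NewLine);
    }
}
```

For example, TwoStepNormalizer.Normalize(text, "\r\n") forces \r\n regardless of platform.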
Normalise breaks so that they are all \r\n:
var normalisedString =
    sourceString
        .Replace("\r\n", "\n")
        .Replace("\n\r", "\n")
        .Replace("\r", "\n")
        .Replace("\n", "\r\n");
It's a two-step process.
First you convert all the combinations of \r and \n into a single one, say \r.
Then you convert all the \r into your target \r\n.
normalized =
    original
        .Replace("\r\n", "\r")
        .Replace("\n\r", "\r")
        .Replace("\n", "\r")
        .Replace("\r", "\r\n"); // last step
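One caveat worth knowing: each chained Replace rescans the output of the previous one, so on mixed runs the chain can pair characters differently from a single-pass regex. The sketch below puts the two side by side so the difference can be tried out. ChainVersusRegex, Chained and SinglePass are illustrative names only.

```csharp
using System.Text.RegularExpressions;

public static class ChainVersusRegex
{
    // The chained replacements from the answer above.
    public static string Chained(string original) =>
        original
            .Replace("\r\n", "\r")
            .Replace("\n\r", "\r")
            .Replace("\n", "\r")
            .Replace("\r", "\r\n"); // last step

    // The single-pass regex from the earlier answer.
    public static string SinglePass(string original) =>
        Regex.Replace(original, @"\r\n|\n\r|\n|\r", "\r\n");
}
```

On the six inputs in the question both give the same result. On "\n\r\n", though, Chained produces one break (the first step leaves "\n\r", which the second step then merges), while SinglePass produces two. Whether that matters depends on how such mixed sequences should be read.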
A regex would help. You could do something roughly like this:
(\r\n|\n\n|\n\r|\r|\n) replace with \r\n
This regex produced the following matches for the left-hand side of the table posted (I only tested matching), so a replace should normalize.
\r => \r
\n => \n
\n\n => \n\n
\n\r => \n\r
\r\n => \r\n
\r\n => \r\n
\n => \n
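Note that the \n\n alternative in the pattern above matches two line feeds as a single unit, so a replace would turn \n\n into one \r\n rather than the two the question's table asks for. A quick way to see this, as a sketch with made-up names (PatternComparison, WithDoubleLf, WithoutDoubleLf):

```csharp
using System.Text.RegularExpressions;

public static class PatternComparison
{
    // Pattern as posted above, including the \n\n alternative.
    public static string WithDoubleLf(string s) =>
        Regex.Replace(s, @"(\r\n|\n\n|\n\r|\r|\n)", "\r\n");

    // Same pattern with \n\n removed, so each \n is replaced on its own.
    public static string WithoutDoubleLf(string s) =>
        Regex.Replace(s, @"(\r\n|\n\r|\r|\n)", "\r\n");
}
```

For the input "\n\n", WithDoubleLf returns a single "\r\n" and WithoutDoubleLf returns "\r\n\r\n", so dropping that alternative is probably what is wanted here.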