Replace a "new line" char with regex in C#

Tags: c#, regex

I want to find every place in a text file where a letter or digit is followed by a new line and then by another letter or digit, and replace that new line with a space.

This is what I've tried so far:

private void button3_Click(object sender, EventArgs e)
{
     string pathFOSE = @"D:\Public\temp\FOSEtest.txt";     
     string output = Regex.Replace(pathFOSE, @"(?<=\w)\n(?=\w)", " ");                      

     string pathNewFOSE = @"D:\Public\temp\NewFOSE.txt";
     if (!System.IO.File.Exists(pathNewFOSE))
     {
          // Create a file to write to. 
          using (System.IO.StreamWriter sw = System.IO.File.CreateText(pathNewFOSE))
          {                
          }
     File.AppendAllText(pathNewFOSE, output);
     }
}

But all my program does is create a new text file containing only this line: "D:\Public\temp\FOSEtest.txt".

Any idea what's happening? Also, is \n the correct way to search for new lines in a text file on Windows 7? Thanks

Edit: I made the change suggested by Avinash and added that I was working on Windows 7.

Edit 2: I think I need to understand why the Replace is taking place on the path string and not on the file it leads to before trying suggestions.

Final Edit: Everything is working thanks to stribizhev; I just copy-pasted his answer. Thanks to everyone who responded!

Loukoum Mira asked Mar 16 '23 13:03


2 Answers

You need to use a positive lookbehind.

Regex.Replace(pathFOSE, @"(?<=\w)\n(?=\w)", " "); 
                            ^

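(?<=\w) is a positive lookbehind, which asserts that the match must be preceded by a word character. (?=\w) is a positive lookahead, which asserts that the match must be followed by a word character.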

or

Regex.Replace(pathFOSE, @"(?<=\w)[\r\n]+(?=\w)", " "); 
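
To see what the two lookarounds do, here is a minimal sketch. The class and method names are made up for illustration, and the input is a literal string rather than a file. Because lookarounds only assert context and do not consume it, the characters on either side of the line break are kept.

```csharp
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Replaces "\n" with a space only when a word character sits on both sides.
    // The lookbehind (?<=\w) and lookahead (?=\w) check the neighbours
    // without making them part of the match, so they are not replaced.
    public static string JoinWrappedLines(string text)
    {
        return Regex.Replace(text, @"(?<=\w)\n(?=\w)", " ");
    }
}
```

For example, JoinWrappedLines("first\nsecond.\nthird") joins the first two lines but leaves the second line break alone, because it is preceded by a period rather than a word character.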
Avinash Raj answered Mar 18 '23 02:03


In Windows, line breaks are usually \r\n (carriage return + line feed). So, you can match line breaks that are preceded and followed by an alphanumeric character with

string output = Regex.Replace(pathFOSE, @"(?<=\w)\r\n(?=\w)", " ");

Mind that \w also matches non-English Unicode letters and digits, as well as the underscore. If you do not need that behavior (and only need to match English letters and digits), use

string output = Regex.Replace(pathFOSE, @"(?i)(?<=[a-z0-9])\r\n(?=[a-z0-9])", " ");
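
As a rough illustration of the difference, the sketch below puts the two patterns side by side. The class and method names are hypothetical, and the inputs are literal strings rather than file contents.

```csharp
using System.Text.RegularExpressions;

public static class WordCharDemo
{
    // \w: Unicode letters and digits, plus the underscore, count as neighbours.
    public static string JoinUnicode(string text)
    {
        return Regex.Replace(text, @"(?<=\w)\r\n(?=\w)", " ");
    }

    // (?i)[a-z0-9]: only English letters and digits count as neighbours.
    public static string JoinAscii(string text)
    {
        return Regex.Replace(text, @"(?i)(?<=[a-z0-9])\r\n(?=[a-z0-9])", " ");
    }
}
```

With the input "café\r\nbar", JoinUnicode replaces the line break with a space, while JoinAscii leaves the text unchanged because é is not in [a-z0-9]. The same happens with "foo_\r\nbar" because of the underscore.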

If you have a mixture of line breaks from various OSes or programs, you may use

string output = Regex.Replace(pathFOSE, @"(?i)(?<=[a-z0-9])(?:\r\n|\n|\r)(?=[a-z0-9])", " ");

If several line breaks can occur in a row, add a + quantifier: (?:\r\n|\n|\r)+.
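
A small sketch of the effect of the + quantifier, again on literal strings and with made-up class and method names:

```csharp
using System.Text.RegularExpressions;

public static class LineBreakDemo
{
    // Matches exactly one line break (\r\n, \n or \r) between two alphanumerics.
    public static string JoinSingle(string text)
    {
        return Regex.Replace(text, @"(?i)(?<=[a-z0-9])(?:\r\n|\n|\r)(?=[a-z0-9])", " ");
    }

    // Matches a run of one or more line breaks and collapses it to one space.
    public static string JoinRuns(string text)
    {
        return Regex.Replace(text, @"(?i)(?<=[a-z0-9])(?:\r\n|\n|\r)+(?=[a-z0-9])", " ");
    }
}
```

JoinSingle turns "a\r\nb\nc\rd" into "a b c d" but leaves "d\n\ne" untouched, since neither \n has an alphanumeric on both sides. JoinRuns turns "d\n\ne" into "d e".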

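Regex.Replace works on the string you pass to it, so passing pathFOSE runs the replacement on the path itself, not on the file. To perform search and replace on the file contents, you need to read the file in first.

You can do it with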

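// Needs: using System.IO; using System.Text.RegularExpressions;
var pathFOSE = @"D:\Public\temp\FOSEtest.txt";
// Read the whole file into a string so the regex runs on its contents, not on the path.
var contents = File.ReadAllText(pathFOSE);
// Replace a line break that sits between two English letters or digits with a space.
var output = Regex.Replace(contents, @"(?i)(?<=[a-z0-9])(?:\r\n|\n|\r)(?=[a-z0-9])", " ");

var pathNewFOSE = @"D:\Public\temp\NewFOSE.txt";
// Write the result only if the target file does not exist yet.
if (!File.Exists(pathNewFOSE))
{
    File.WriteAllText(pathNewFOSE, output);
}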
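
To make the path-versus-contents difference concrete, here is a hedged sketch with two hypothetical helpers (the names are not from the question). One runs the replacement on whatever string it receives, the other reads the file first.

```csharp
using System.IO;
using System.Text.RegularExpressions;

public static class PathVersusContents
{
    // Same pattern as in the snippet above.
    private const string Pattern = @"(?i)(?<=[a-z0-9])(?:\r\n|\n|\r)(?=[a-z0-9])";

    // Runs the replacement on the string it is given, whatever that string is.
    public static string JoinLines(string input)
    {
        return Regex.Replace(input, Pattern, " ");
    }

    // Reads the file at the given path, then runs the replacement on its text.
    public static string JoinLinesInFile(string path)
    {
        string contents = File.ReadAllText(path);
        return JoinLines(contents);
    }
}
```

Calling JoinLines(@"D:\Public\temp\FOSEtest.txt") returns the path unchanged, because the path string contains no line breaks. That is why the output file in the question held only the path. JoinLinesInFile works on the text stored in the file instead.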
Wiktor Stribiżew answered Mar 18 '23 01:03