
Regular expression replace in C#

Tags:

c#

regex

I'm fairly new to regular expressions and, even after reading a few tutorials, I can't get this step of my Regex.Replace call to format the output properly.

Here's the scenario I'm working on: when I pull my data from the listbox, I want to convert it to a CSV-like format and then save the file. Is Regex.Replace a good fit for this?

Example input, before the regular expression formatting:

FirstName LastName Salary       Position
-------------------------------------
John      Smith    $100,000.00  M

Proposed format after regular expression replace

John Smith,100000,M 

Current formatting status output:

John,Smith,100000,M 

Note: is there a way I can replace the first comma with a space?

Snippet of my code

using (var fs = new FileStream(filepath, FileMode.OpenOrCreate, FileAccess.Write))
{
    using (var sw = new StreamWriter(fs))
    {
        foreach (string stw in listBox1.Items)
        {
            StringBuilder sb = new StringBuilder();
            sb.AppendLine(stw);

            //Piecing the list back to the original format
            sb_trim = Regex.Replace(stw, @"[$,]", "");
            sb_trim = Regex.Replace(sb_trim, @"[.][0-9]+", "");
            sb_trim = Regex.Replace(sb_trim, @"\s", ",");
            sw.WriteLine(sb_trim);
        }
    }
}
Curtis asked Apr 20 '13 05:04


2 Answers

You can do this with two replacements:

//let stw be "John Smith $100,000.00 M"

sb_trim = Regex.Replace(stw, @"\s+\$|\s+(?=\w+$)", ",");
//sb_trim becomes "John Smith,100,000.00,M"

sb_trim = Regex.Replace(sb_trim, @"(?<=\d),(?=\d)|[.]0+(?=,)", "");
//sb_trim becomes "John Smith,100000,M"

sw.WriteLine(sb_trim);
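
The sample row in the question is padded with repeated spaces and ends with a trailing space, which the two patterns above do not account for. A minimal sketch of one way to handle that, assuming it is acceptable to trim the row and collapse whitespace runs first; the class and method names (`TwoStepCsv`, `ToCsvLine`) are hypothetical and not part of the original answer:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TwoStepCsv
{
    // Hypothetical helper wrapping the two replacements shown above.
    public static string ToCsvLine(string row)
    {
        // Assumption: normalize padding so the row looks like
        // "John Smith $100,000.00 M" before the two replacements run.
        string s = Regex.Replace(row.Trim(), @"\s+", " ");

        // Step 1: whitespace followed by '$' (the '$' is consumed) and the
        // whitespace before the final word both become a comma.
        s = Regex.Replace(s, @"\s+\$|\s+(?=\w+$)", ",");

        // Step 2: drop commas that sit between two digits, and drop an
        // all-zero fraction such as ".00" when a comma follows it.
        // A non-zero fraction such as ".50" is left in place.
        s = Regex.Replace(s, @"(?<=\d),(?=\d)|[.]0+(?=,)", "");

        return s;
    }
}
```

Calling `TwoStepCsv.ToCsvLine("John      Smith    $100,000.00  M ")` should return `John Smith,100000,M`.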
Anirudha answered Sep 19 '22 15:09


Try this:

sb_trim = Regex.Replace(stw, @"(\D+)\s+\$([\d,]+)\.\d+\s+(.)",
    m => string.Format(
        "{0},{1},{2}",
        m.Groups[1].Value,
        m.Groups[2].Value.Replace(",", string.Empty),
        m.Groups[3].Value));

This is about as clean an answer as you'll get, at least with regexes.

  • (\D+): First capture group. One or more non-digit characters.
  • \s+\$: One or more whitespace characters, then a literal dollar sign ($).
  • ([\d,]+): Second capture group. One or more digits and/or commas.
  • \.\d+: Decimal point, then at least one digit.
  • \s+: One or more whitespace characters.
  • (.): Third capture group. Any single character except a newline.
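
To see what each group captures, the pattern can be run through `Regex.Match` and the groups read back. This is only an illustrative sketch; `GroupInspector` and `Capture` are hypothetical names, and the input is assumed to be a single-spaced row:

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupInspector
{
    // Hypothetical helper: returns the three captured values,
    // or an empty array when the row does not match the pattern.
    public static string[] Capture(string row)
    {
        Match m = Regex.Match(row, @"(\D+)\s+\$([\d,]+)\.\d+\s+(.)");
        if (!m.Success)
        {
            return new string[0];
        }

        // Groups[0] is the whole match; the numbered groups start at 1.
        return new[]
        {
            m.Groups[1].Value, // name part, e.g. "John Smith"
            m.Groups[2].Value, // digits and commas, e.g. "100,000"
            m.Groups[3].Value  // single character, e.g. "M"
        };
    }
}
```

For the input `John Smith $100,000.00 M`, the three values should be `John Smith`, `100,000`, and `M`. Note that the second group still contains its comma at this point.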

The second capture group additionally needs to have its commas stripped. You could do this with another regex, but it's really unnecessary and bad for performance. This is why we use a lambda expression with string.Format to piece together the replacement. If it weren't for that, we could just use this replacement string in place of the lambda expression:

"$1,$2,$3" 
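
The difference between the two forms can be sketched side by side. The class and method names below are hypothetical, and the input is assumed to be a single-spaced row; the pattern is the one from this answer:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplacementStyles
{
    private const string Pattern = @"(\D+)\s+\$([\d,]+)\.\d+\s+(.)";

    // Substitution string: each group is copied verbatim,
    // so the salary keeps its thousands separator.
    public static string WithSubstitution(string row)
    {
        return Regex.Replace(row, Pattern, "$1,$2,$3");
    }

    // MatchEvaluator lambda: each group can be post-processed
    // before the pieces are joined, so the comma can be removed.
    public static string WithEvaluator(string row)
    {
        return Regex.Replace(row, Pattern, m => string.Format(
            "{0},{1},{2}",
            m.Groups[1].Value,
            m.Groups[2].Value.Replace(",", string.Empty),
            m.Groups[3].Value));
    }
}
```

With the input `John Smith $100,000.00 M`, `WithSubstitution` should give `John Smith,100,000,M`, while `WithEvaluator` should give `John Smith,100000,M`.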
Zenexer answered Sep 20 '22 15:09