
Find specific pattern in string using C#

Tags: string, c#, regex

I am trying to find, and remove, a specific pattern inside a string with C#.

The pattern is an asterisk, followed by any number of digits, followed by .txt

Example strings:

  1. test*123.txt
  2. test2*1.txt
  3. test*1234.txt3
  4. test4*12.txt123

Given these examples, the desired results would be:

  1. test ("*123.txt" was removed)
  2. test2 ("*1.txt" was removed)
  3. test3 ("*1234.txt" was removed)
  4. test4123 ("*12.txt" was removed)

How can this be accomplished?

Sesame asked Mar 02 '11 20:03


2 Answers

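// Requires: using System.Text.RegularExpressions;
// Pattern: a literal asterisk, zero or more digits, then a literal ".txt".
string pattern = @"\*\d*\.txt";
Regex rgx = new Regex(pattern);
input = rgx.Replace(input, "");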
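As a rough check, the snippet above can be wrapped in a small program and run against the four example strings from the question. This is only a sketch: the class name PatternRemover and the method name Remove are made up for illustration and are not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, named here only for illustration.
public static class PatternRemover
{
    // A literal asterisk, zero or more digits, then a literal ".txt".
    private static readonly Regex Pattern = new Regex(@"\*\d*\.txt");

    // Returns the input with every match of the pattern removed.
    public static string Remove(string input)
    {
        return Pattern.Replace(input, "");
    }

    public static void Main()
    {
        // The four example strings from the question.
        string[] inputs =
        {
            "test*123.txt",
            "test2*1.txt",
            "test*1234.txt3",
            "test4*12.txt123"
        };

        foreach (string input in inputs)
        {
            Console.WriteLine(input + " -> " + Remove(input));
        }
    }
}
```

Run as written, this should print test, test2, test3 and test4123 on the right-hand side, matching the results the question asks for.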
mellamokb answered Oct 13 '22 07:10


If you build a regular expression and replace its matches with an empty string, you're effectively removing that pattern. Here's what you'll need for your pattern:

  1. An asterisk has a special meaning in a regular expression (zero or more of the previous item), so you'll have to escape it with a backslash (\*).

  2. You can match a digit with the digit character class (\d) or with an explicit class that lists them ([0-9]). The two are not equivalent in .NET: by default \d matches any Unicode decimal digit, including Eastern Arabic numerals (٠, ١, ٢, ٣, ٤, ٥, ٦, ٧, ٨, ٩), while [0-9] matches only the ASCII digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9). With RegexOptions.ECMAScript, \d behaves like [0-9].

  3. You can use a + quantifier to match one or more of the previous item: \d+ will match one or more digits.

  4. A dot is another special character (it matches any single character except for newlines). It will also need escaping (\.).

  5. You can match text without special characters with the text itself: txt matches exactly txt.
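The difference between \d and [0-9] described in item 2 can be seen with a short sketch. The class name DigitClassDemo and the helper IsAllDigits are hypothetical names chosen for this example. It assumes the default .NET behavior, where \d covers every Unicode decimal digit unless RegexOptions.ECMAScript is passed.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, named here only for illustration.
public static class DigitClassDemo
{
    // True when the whole input is one or more characters matched by digitClass.
    public static bool IsAllDigits(string input, string digitClass,
                                   RegexOptions options = RegexOptions.None)
    {
        return Regex.IsMatch(input, "^" + digitClass + "+$", options);
    }

    public static void Main()
    {
        string ascii = "123";
        string easternArabic = "\u0661\u0662\u0663"; // the digits ١٢٣

        Console.WriteLine(IsAllDigits(ascii, @"\d"));            // True
        Console.WriteLine(IsAllDigits(easternArabic, @"\d"));    // True
        Console.WriteLine(IsAllDigits(easternArabic, "[0-9]"));  // False
        // In ECMAScript mode, \d is restricted to the ASCII digits.
        Console.WriteLine(IsAllDigits(easternArabic, @"\d", RegexOptions.ECMAScript)); // False
    }
}
```

If the file names in question only ever contain ASCII digits, either class works; [0-9] simply makes that restriction explicit.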

Putting everything together we get:

string purged = Regex.Replace(input, @"\*[0-9]+\.txt", "");
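Note that this pattern uses + (one or more digits), while the other answer uses * (zero or more). The question's "any number of" wording does not say whether an asterisk followed directly by .txt should be removed, so the sketch below only shows how the two quantifiers differ on such an input. The class and method names are hypothetical, and the string "test*.txt" is an extra input that does not appear in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, named here only for illustration.
public static class QuantifierDemo
{
    // Zero or more digits between the asterisk and ".txt".
    public static string RemoveZeroOrMore(string input)
    {
        return Regex.Replace(input, @"\*\d*\.txt", "");
    }

    // One or more digits between the asterisk and ".txt".
    public static string RemoveOneOrMore(string input)
    {
        return Regex.Replace(input, @"\*[0-9]+\.txt", "");
    }

    public static void Main()
    {
        // With at least one digit present, both patterns behave the same.
        Console.WriteLine(RemoveZeroOrMore("test*123.txt")); // test
        Console.WriteLine(RemoveOneOrMore("test*123.txt"));  // test

        // With no digits, only the zero-or-more version removes the match.
        Console.WriteLine(RemoveZeroOrMore("test*.txt"));    // test
        Console.WriteLine(RemoveOneOrMore("test*.txt"));     // test*.txt
    }
}
```

For the four example strings in the question, both versions give the same results, so the choice only matters if inputs without digits can occur.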
R. Martinho Fernandes answered Oct 13 '22 06:10