
Match everything before a specific word in a multiline string

Tags: c#, regex

I'm trying to filter some garbage text out of a string with a regex, but I can't get it to work. I'm not a regex expert (not even close), and I've searched for similar examples, but none of them seem to solve my problem.

I need a regex that matches everything from the start of a string to a specific word in that string but not the word itself.

Here's an example:

<p>This is the string I want to process which, as you can see, also contains HTML tags like <i>this</i> and <strong>this</strong></p>
<p>I want to remove everything in the string BEFORE the word "giraffe" (but not "giraffe" itself) and keep everything after it.</p>

So, how do I match everything in the string before the word "giraffe"?

Thanks!

Joakim Megert asked Dec 27 '22 07:12



2 Answers

resultString = Regex.Replace(subjectString, 
    @"\A             # Start of string
    (?:              # Match...
     (?!""giraffe"") #  (unless we're at the start of the string ""giraffe"")
    .                #  any character (including newlines)
    )*               # zero or more times", 
    "", RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace);

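This should work. Note that the pattern looks for the word together with its surrounding double quotes ("giraffe"), as it appears in the example; if the word is not quoted in the real input, drop the doubled quotes from the lookahead.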
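For comparison, the same effect can be had with a shorter pattern: a lazy match from the start of the string, stopped by a lookahead. This is only a minimal sketch, assuming the target is the bare word giraffe without surrounding quotes; the `PrefixTrimmer` class and `RemoveBefore` method are hypothetical names introduced for illustration, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrefixTrimmer
{
    // Hypothetical helper: removes everything before the first occurrence of `word`.
    // If `word` does not occur, the input is returned unchanged.
    public static string RemoveBefore(string input, string word)
    {
        // \A      anchors the match at the very start of the string.
        // .*?     lazily consumes characters (Singleline lets '.' match newlines).
        // (?=...) is a lookahead: it stops right before the word without consuming it.
        string pattern = @"\A.*?(?=" + Regex.Escape(word) + ")";
        return Regex.Replace(input, pattern, "", RegexOptions.Singleline);
    }
}
```

Calling `PrefixTrimmer.RemoveBefore(text, "giraffe")` on a multiline string returns the text starting at the first "giraffe". `Regex.Escape` is there so that a search word containing regex metacharacters is treated literally.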

Tim Pietzcker answered Mar 08 '23 23:03



Why regex?

string s = "blagiraffe";
// IndexOf returns -1 when "giraffe" is absent, which makes Substring throw.
s = s.Substring(s.IndexOf("giraffe"));
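The `IndexOf` approach throws an `ArgumentOutOfRangeException` when the word is missing, because `IndexOf` returns -1 in that case. A guarded version might look like the sketch below; the `StringTrimmer` class and `RemoveBefore` method are hypothetical names, and returning the input unchanged when the word is absent is an assumption about the desired behavior.

```csharp
using System;

public static class StringTrimmer
{
    // Hypothetical helper: drops everything before the first occurrence of `word`.
    // Assumption: when `word` is not found, the input is returned as-is.
    public static string RemoveBefore(string input, string word)
    {
        // Ordinal comparison gives a culture-independent, case-sensitive search.
        int index = input.IndexOf(word, StringComparison.Ordinal);
        return index < 0 ? input : input.Substring(index);
    }
}
```

With this helper, `StringTrimmer.RemoveBefore("blagiraffe", "giraffe")` yields "giraffe", and a string that does not contain the word comes back untouched.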
Jaster answered Mar 09 '23 00:03
