How do I use RegEx to pick longest match?

Tags:

c#

regex

I tried looking for an answer to this question but couldn't find anything, and I hope there's an easy solution. I am using the following code in C#:

// requires: using System.Text.RegularExpressions;
string pattern = "(hello|hello world)";
Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = regex.Matches("hello world"); // the only match is "hello"

The question is: is there a way for the Matches method to return the longest pattern first? In this case, I want to get "hello world" as my match rather than just "hello". This is just an example; my actual pattern list consists of a fair number of words.

user3749947 asked Jun 17 '14 19:06


1 Answer

If you already know the lengths of the words beforehand, then put the longest first. For example:

String pattern = ("(hello world|hello)");

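The regex engine tries the alternatives from left to right and takes the first one that succeeds, so the longest will be matched first when it is listed first. If you can't order the alternatives by length before building the pattern, the engine won't pick the longest one for you.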
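When the words live in a collection, the "longest first" ordering can be produced at the moment the pattern is built. The following is a minimal sketch, not a definitive implementation: the class name LongestFirstPattern and the method Build are hypothetical, and it assumes every word should be matched as literal text (hence Regex.Escape).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LongestFirstPattern
{
    // Hypothetical helper: builds "(w1|w2|...)" with the longest words
    // first, so the left-to-right alternation tries them before shorter ones.
    public static string Build(IEnumerable<string> words)
    {
        var ordered = words
            .OrderByDescending(w => w.Length)   // longest alternative first
            .Select(w => Regex.Escape(w));      // treat each word as literal text
        return "(" + string.Join("|", ordered) + ")";
    }

    public static void Main()
    {
        string pattern = Build(new[] { "hello", "hello world" });
        var regex = new Regex(pattern, RegexOptions.IgnoreCase);
        foreach (Match m in regex.Matches("hello world"))
            Console.WriteLine(m.Value);         // prints "hello world"
    }
}
```

Sorting by length is enough for plain words; it does not help when the alternatives are themselves patterns of variable length.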

An alternative approach would be to store all the matches in an array/hash/list and pick the longest one manually, using the language's built-in functions.
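A rough sketch of that manual approach, using LINQ: each word is run as its own pattern against the input, and the longest successful match is kept. The names LongestMatchPicker and Longest are hypothetical, and the sketch assumes the words are literal text and that an empty string is an acceptable "no match" result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LongestMatchPicker
{
    // Hypothetical helper: matches every word separately and returns the
    // longest matched text, or an empty string if no word matches.
    public static string Longest(IEnumerable<string> words, string input)
    {
        return words
            .Select(w => Regex.Match(input, Regex.Escape(w), RegexOptions.IgnoreCase))
            .Where(m => m.Success)              // keep only words found in the input
            .OrderByDescending(m => m.Length)   // longest match first
            .Select(m => m.Value)
            .FirstOrDefault() ?? string.Empty;
    }

    public static void Main()
    {
        var words = new[] { "hello", "hello world" };
        Console.WriteLine(Longest(words, "hello world")); // prints "hello world"
    }
}
```

This runs one regex per word, so it trades speed for not having to care about the order of the alternatives.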

Amal Murali answered Nov 01 '22 07:11