
How to find all the words starting with '$' sign and ending with space, in a long string?

Tags: string, c#, regex

In C#, how do I find all the words starting with '$' sign and ending with space, in a long string, using regular expressions?

asked Oct 21 '10 11:10 by Zain Shaikh


2 Answers

Try:

var matches = Regex.Matches(input, "(\\$\\w+) ");

In the above, \\w matches word characters: letters, digits and the underscore (in .NET this includes Unicode letters and digits, but not the hyphen). If you want to match everything that's not whitespace, you can use \\S. If you want a specific set, specify it with a character class, e.g. [a-zA-Z0-9].

The parentheses around (\\$\\w+) form a capturing group, so for any given match, match.Groups[1].Value gives the value inside the parentheses (that is, excluding the trailing space).
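The difference between these character classes is easiest to see on a single input. The following is only a small sketch; the class name CharClassDemo and the helper FirstMatch are made up for illustration and are not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Hypothetical helper: returns the text of the first match, or "" if there is none.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string input = "$user_1-x ";

        // \w stops at the hyphen because '-' is not a word character.
        Console.WriteLine(FirstMatch(input, @"\$\w+"));          // $user_1

        // \S accepts any non-whitespace character, so it runs up to the space.
        Console.WriteLine(FirstMatch(input, @"\$\S+"));          // $user_1-x

        // An explicit set without '_' stops at the underscore.
        Console.WriteLine(FirstMatch(input, @"\$[a-zA-Z0-9]+")); // $user
    }
}
```

Which class is right depends on what counts as a "word" in your data.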

As a complete example:

string input = "$a1 $a2 $b1 $b2";

foreach (Match match in Regex.Matches(input, "(\\$\\w+) "))
{
    Console.WriteLine(match.Groups[1].Value);
}

This produces the following output:

$a1
$a2
$b1

The $b2 is of course omitted because it does not have a trailing space.
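If reading the result out of a capturing group feels indirect, a lookahead is one possible alternative: it requires the trailing space without making it part of the match, so match.Value is already the word. The sketch below also uses \S so that characters such as '-' are accepted; the names DollarWords and Find are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DollarWords
{
    // Hypothetical helper: returns every "$..." word that is followed by a space.
    public static List<string> Find(string input)
    {
        var result = new List<string>();
        // \$     a literal dollar sign
        // \S+    one or more non-whitespace characters
        // (?= )  lookahead: a space must follow, but it is not consumed
        foreach (Match match in Regex.Matches(input, @"\$\S+(?= )"))
        {
            result.Add(match.Value);
        }
        return result;
    }

    public static void Main()
    {
        foreach (string word in Find("$a1 $a-2 $b1 $b2"))
        {
            Console.WriteLine(word);
        }
    }
}
```

For this input it prints $a1, $a-2 and $b1; the final $b2 is still skipped because no space follows it.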

answered Sep 30 '22 07:09 by Pieter van Ginkel


You could also try it without regular expressions, which may be faster.

string longText = "";
List<string> found = new List<string>();
foreach (var item in longText.Split(' '))
{
    if (item.StartsWith("$"))
    {
        found.Add(item);
    }
}

EDIT: After Zain Shaikh's comment I wrote a simple program to benchmark both approaches. Here are the results.

        string input = "$a1 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2";
        // Time the regular expression approach (a single, un-warmed run).
        var s1 = Stopwatch.StartNew();
        foreach (Match match in Regex.Matches(input, "(\\$\\w+) "))
        {
        }
        s1.Stop();
        Console.WriteLine(" 1) " + (s1.Elapsed.TotalMilliseconds * 1000 * 1000).ToString("0.00 ns"));
        double first = s1.Elapsed.TotalMilliseconds;

        // Time the Split/StartsWith approach with a fresh stopwatch.
        s1 = Stopwatch.StartNew();
        foreach (var item in input.Split(' '))
        {
            if (item.StartsWith("$"))
            {
            }
        }
        s1.Stop();
        Console.WriteLine(" 2) " + (s1.Elapsed.TotalMilliseconds * 1000 * 1000).ToString("0.00 ns"));
        // Difference in milliseconds; a negative value means the second approach was faster.
        Console.WriteLine(s1.Elapsed.TotalMilliseconds - first);

Output:

1) 730600.00 ns
2)  53000.00 ns
-0.6776

In this single run, the string functions (even with foreach) were faster than the regular expression ;)
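A single measurement like this also charges the regex with one-time costs such as parsing the pattern, so it is worth repeating the comparison with a warm-up call and many iterations. The following is only a sketch of how that could look; the class SplitVsRegex and its method names are invented here, and the run count of 10000 is an arbitrary assumption.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitVsRegex
{
    // Constructed once, so pattern parsing is not part of the timed loop.
    private static readonly Regex DollarWord = new Regex("(\\$\\w+) ");

    public static int CountWithRegex(string input)
    {
        return DollarWord.Matches(input).Count;
    }

    public static int CountWithSplit(string input)
    {
        int count = 0;
        foreach (string item in input.Split(' '))
        {
            if (item.StartsWith("$", StringComparison.Ordinal)) count++;
        }
        return count;
    }

    // Average time per call in microseconds, after one untimed warm-up call.
    private static double AverageMicroseconds(Func<string, int> method, string input, int runs)
    {
        method(input);
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++) method(input);
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds * 1000 / runs;
    }

    public static void Main()
    {
        // Test input similar in shape to the benchmark string above.
        string input = string.Join(" ", Enumerable.Repeat("$a2 $b1 $b2", 30));
        const int runs = 10000; // arbitrary choice for this sketch

        Console.WriteLine("regex: " + AverageMicroseconds(CountWithRegex, input, runs).ToString("0.00") + " us");
        Console.WriteLine("split: " + AverageMicroseconds(CountWithSplit, input, runs).ToString("0.00") + " us");
    }
}
```

Keep in mind that the two approaches do not return exactly the same set: Split also yields the last word, which has no trailing space, so the counts differ by one on input like this.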

answered Sep 30 '22 07:09 by Ahmet Kakıcı