Is Regex instance thread safe for matches in C#

Tags: c#, regex, .net-4.0

I have this regex which I am using in a Parallel.ForEach<string>. Is it safe?

    Regex reg = new Regex(SomeRegexStringWith2Groups);
    Parallel.ForEach<string>(MyStrings.ToArray(), (str) =>
    {
        foreach (Match match in reg.Matches(str)) //is this safe?
            lock (dict)
                if (!dict.ContainsKey(match.Groups[1].Value))
                    dict.Add(match.Groups[1].Value, match.Groups[2].Value);
    });
Asked by Arlen Beiler, Oct 29 '12 20:10


1 Answer

Regex objects are read-only and therefore thread safe. It is the objects they return, Match and MatchCollection, that could potentially cause problems. MSDN confirms this:

The Regex class itself is thread safe and immutable (read-only). That is, Regex objects can be created on any thread and shared between threads; matching methods can be called from any thread and never alter any global state.

However, result objects (Match and MatchCollection) returned by Regex should be used on a single thread ..

I'd be concerned about whether your MatchCollection is being generated or consumed in a way that might be concurrent, which could make the collection behave unpredictably. Some Match implementations use delayed evaluation, which could cause surprising behavior in that foreach loop. I would probably collect all the matches first and evaluate them afterwards, both to be safe and to get consistent performance.
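A minimal sketch of that "collect first, evaluate later" idea, not a definitive implementation. It assumes a hypothetical two-group pattern `(\w+)=(\w+)` standing in for SomeRegexStringWith2Groups, and the names RegexParallelSketch and Collect are made up for illustration. Each MatchCollection is fully read inside the delegate that created it, and only plain strings leave the parallel phase, so no Match object is shared between threads:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexParallelSketch
{
    // Assumption: a hypothetical two-group pattern, not the asker's real one.
    // The Regex instance itself is shared by all worker threads.
    private static readonly Regex Reg = new Regex(@"(\w+)=(\w+)");

    public static Dictionary<string, string> Collect(IEnumerable<string> inputs)
    {
        // Phase 1 (parallel): copy the group values out of each Match into
        // plain key/value pairs while still inside the delegate that called
        // Matches, so lazy evaluation finishes on that same thread.
        List<KeyValuePair<string, string>> pairs = inputs
            .AsParallel()
            .AsOrdered() // keep input order so "first key wins" is deterministic
            .SelectMany(str => Reg.Matches(str)
                .Cast<Match>()
                .Select(m => new KeyValuePair<string, string>(
                    m.Groups[1].Value, m.Groups[2].Value))
                .ToList()) // force evaluation before the delegate returns
            .ToList();

        // Phase 2 (single thread): fill the dictionary; no lock is needed
        // because nothing else touches it here.
        var dict = new Dictionary<string, string>();
        foreach (var pair in pairs)
        {
            if (!dict.ContainsKey(pair.Key))
                dict.Add(pair.Key, pair.Value);
        }
        return dict;
    }
}
```

Calling `RegexParallelSketch.Collect(new[] { "a=1 b=2", "c=3", "a=9" })` should give three entries, with "a" mapped to "1" because input order is preserved. Whether this actually outperforms the lock-based loop in the question depends on the input and is worth measuring.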

Answered by tmesser, Oct 05 '22 23:10