I'm new to programming and I'm using C# 2010. There are some quite long (50 lines) regular expressions in code I need to maintain. Also I need to parse some text files with a lot of information. Can you recommend a tool for these tasks?
@Lincoln I sympathize with your problem. Unfortunately regexes have very little scope for internal documentation, so a 50-line one is essentially like a binary program. Be aware that changing a single character in it can break the whole thing. Here, for example, is a regex for a date:
^(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])$
This matches a date in yyyy-mm-dd format between 1900-01-01 and 2099-12-31, with a choice of four separators (hyphen, space, slash or dot). The anchors make sure the entire variable is a date, and not a piece of text containing a date. The year is matched by (19|20)\d\d, which uses alternation to allow the first two digits to be 19 or 20.
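To make that description concrete, here is a minimal sketch that runs the pattern against a few strings. It is written in Python only to keep it short (my assumption; the question is about C#), and the helper name and sample strings are made up for illustration. The pattern itself is unchanged.

```python
import re

# The date pattern from above, copied verbatim.
DATE_PATTERN = re.compile(
    r"^(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])$"
)

def looks_like_date(text):
    """Return True if the whole string matches the date pattern."""
    return DATE_PATTERN.match(text) is not None

# Hypothetical sample inputs, chosen to exercise each part of the pattern.
samples = [
    "2010-06-15",     # plain yyyy-mm-dd
    "1999/12/31",     # a different separator
    "1899-01-01",     # year outside 1900..2099
    "on 2010-06-15",  # date embedded in other text, rejected by the anchors
    "2010-13-01",     # month out of range
    "2010-02-31",     # not a real date, but the right shape
]
for s in samples:
    print(s, looks_like_date(s))
```

Note that the last sample is accepted: the pattern checks the shape of the text, not the calendar. That is exactly the kind of detail that is hard to see by reading the regex alone.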
If you didn't know it was a date, it would take a detective-like or cryptanalytic approach to work out what it was doing. RegexBuddy and similar tools will help a bit, but they will not give you the semantics.
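Once you have worked out what a piece does, some engines at least let you record it next to the pattern. The sketch below is an illustration, not a recipe for your 50-line regex: it uses Python's re.VERBOSE flag (again Python is my assumption, for brevity), the comments are my own reading of the date pattern, and same_verdict is a hypothetical helper. .NET has a comparable option, RegexOptions.IgnorePatternWhitespace, but check its rules before relying on it.

```python
import re

# The same date pattern, split over lines with a comment per piece.
# In verbose mode, whitespace outside character classes is ignored,
# so the space inside [- /.] still counts as a separator.
DATE_VERBOSE = re.compile(r"""
    ^                          # start of the string
    (19|20)\d\d                # year: 1900 to 2099
    [- /.]                     # separator: hyphen, space, slash or dot
    (0[1-9]|1[012])            # month: 01 to 12
    [- /.]                     # separator again (need not be the same one)
    (0[1-9]|[12][0-9]|3[01])   # day: 01 to 31
    $                          # end of the string
""", re.VERBOSE)

# The original one-line form, kept for comparison.
ORIGINAL = re.compile(
    r"^(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])$"
)

def same_verdict(text):
    """Return True if both forms agree on whether the text matches."""
    return (DATE_VERBOSE.match(text) is None) == (ORIGINAL.match(text) is None)
```

Comparing the annotated form against the original on real input, as same_verdict does, is a cheap way to confirm you have not changed the behaviour while adding the comments.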
My guess is that your 50-line regex (I shudder when I write those words) will have dates and company ids and addresses and goodness knows what embedded in it.
The only goodish news is that regexes are less dependent on the language than they used to be. So if it was originally written in Java it probably works in C# and vice versa.
Is it simply used for identifying fields, or are there capture groups? Capture groups are pairs of parentheses that extract subfields into a program through the regex API. By examining what these fields contain, you may get a useful pointer to what the regex does.
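As a rough sketch of that kind of inspection, the snippet below runs the date pattern on one sample string and lists what each capture group picked up. It is in Python for brevity (my assumption), and dump_groups is a hypothetical helper name. In C# the same information is available through Match.Groups.

```python
import re

# The date pattern from above; it contains three capture groups.
pattern = re.compile(
    r"^(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])$"
)

def dump_groups(regex, text):
    """Return {group number: captured text}, or None if there is no match."""
    m = regex.match(text)
    if m is None:
        return None
    return {i: m.group(i) for i in range(1, regex.groups + 1)}

print(dump_groups(pattern, "2010-06-15"))
# prints {1: '20', 2: '06', 3: '15'}
```

Group 1 holds only the century digits, not the whole year, because the parentheses wrap just (19|20). Feeding real input lines through the long regex and looking at the groups in this way tells you which fields the surrounding code actually receives.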
Pragmatically, unless it's on the critical path, try to touch it as little as possible!