For example, I set up these:
L = /[a-z,A-Z,ßäüöÄÖÜ]/
V = /[äöüÄÖÜaeiouAEIOU]/
K = /[ßb-zBZ&&[^#{V}]]/
So that /(#{K}#{V}{2})/
matches "ᄚ"
in "azAZᄚ"
.
Are there any better ways of dealing with them?
Could I put those constants in a module in a file somewhere in my Ruby installation folder, so I can include/require them inside any new script I write on my computer? (I'm a newbie and I know I'm muddling this terminology; Please correct me.)
Furthermore, could I get just the meta-characters \L
, \V
, and \K
(or whatever isn't already set in Ruby) to stand for them in regexes, so I don't have to do that string interpolation thing all the time?
You're starting pretty well, but you need to look through the Regexp class code that is installed by Ruby. There are tricks for writing patterns that build themselves using String interpolation. You write the bricks and let Ruby build the walls and house with normal String tricks, then turn the resulting strings into true Regexp instances for use in your code.
For instance:
LOWER_CASE_CHARS = 'a-z'
UPPER_CASE_CHARS = 'A-Z'
CHARS = LOWER_CASE_CHARS + UPPER_CASE_CHARS
DIGITS = '0-9'
CHARS_REGEX = /[#{ CHARS }]/
DIGITS_REGEX = /[#{ DIGITS }]/
WORDS = "#{ CHARS }#{ DIGITS }_"
WORDS_REGEX = /[#{ WORDS }]/
You keep building from small atomic characters and character classes and soon you'll have big regular expressions. Try pasting those one by one into IRB and you'll quickly get the hang of it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With