regex implementation to replace group with its lowercase version

If your regex implementation supports it, you can use \L in the replacement. For example, with GNU sed (both \L and the -r flag are GNU extensions, not POSIX):

sed -r 's/(^.*)/\L\1/'

In Perl, you can do:

$string =~ s/(some_regex)/lc($1)/ge;

The /e option causes the replacement expression to be interpreted as Perl code to be evaluated, whose return value is used as the final replacement value. lc($x) returns the lowercased version of $x. (lc() follows Unicode rules for non-ASCII characters when $x is a decoded character string; for raw byte strings the result depends on the locale and pragmas in effect, so decode your input first if it matters.)

/g means match globally. Omit the g if you only want a single replacement.


If you're using an editor like SublimeText or TextMate [1], there's a good chance you can use

\L$1

as your replacement, where $1 refers to something from the regular expression that you put parentheses around. For example [2], here's something I used to downcase field names in some SQL, getting everything to the right of the 'as' at the end of any given line. First the "find" regular expression:

(as|AS) ([A-Za-z_]+)\s*,$

and then the replacement expression:

$1 '\L$2',

If you use Vim (or gvim), then you'll want to use \L\1 instead of \L$1, but there's another wrinkle that you'll need to be aware of: by default, Vim reverses the syntax between literal parenthesis characters and escaped parenthesis characters. So to designate a part of the regular expression to be included in the replacement ("captured"), you'll use \( at the beginning and \) at the end. Rather than thinking of \ as escaping a special character to make it a literal, think of it as marking the beginning of a special sequence (as with \s, \w, \b and so forth). It may seem odd if you're not used to it, but it is perfectly logical if you think of it in the Vim way.


[1] I've tested this in both TextMate and SublimeText and it works as-is, but some editors use \1 instead of $1. Try both and see which your editor uses.

[2] I just pulled this regex out of my history. I always tweak regexen while using them, and I can't promise this is the final version, so I'm not suggesting it's fit for the purpose described, and especially not with SQL formatted differently from the SQL I was working on, just that it's a specific example of downcasing in regular expressions. YMMV. UAYOR.
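
For engines without \L, the same find/replace can be approximated with a replacement function. The following is only a sketch in Python, whose re module does not support \L in replacement strings; the function name downcase_alias and the sample SQL are made up for illustration, and the pattern carries the same caveats as footnote 2.

```python
import re

# Same "find" pattern as above; re.MULTILINE lets $ match at the end of each line.
PATTERN = re.compile(r"(as|AS) ([A-Za-z_]+)\s*,$", re.MULTILINE)

def downcase_alias(sql: str) -> str:
    """Quote and lowercase the alias after 'as' on lines that end in a comma."""
    # Hypothetical helper: the lambda plays the role of  $1 '\L$2',
    return PATTERN.sub(lambda m: f"{m.group(1)} '{m.group(2).lower()}',", sql)

# Illustrative input only; the last column has no trailing comma, so it is untouched.
sample = (
    "SELECT u.ID AS User_ID,\n"
    "       u.NAME as User_Name,\n"
    "       u.AGE AS Age\n"
    "FROM users u"
)
print(downcase_alias(sample))
```

Only the lines ending in a comma are rewritten, which matches the behavior of the original pattern rather than fixing its limitations.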


Several answers have noted the use of \L. However, \E is also worth knowing about if you use \L.

\L converts everything up to the next \U or \E to lowercase. ... \E turns off case conversion.

(Source: https://www.regular-expressions.info/replacecase.html )

So, suppose you wanted to use rename to lowercase part of some file names like this:

artist_-_album_-_Song_Title_to_be_Lowercased_-_MultiCaseHash.m4a
artist_-_album_-_Another_Song_Title_to_be_Lowercased_-_MultiCaseHash.m4a

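you could do something like the following (this assumes the Perl-based rename, which takes a Perl expression; the util-linux rename uses a different syntax). The dot before m4a is escaped so that it matches a literal dot, and /g is unnecessary because the pattern is anchored and can match only once per name:

rename -v 's/^(.*_-_)(.*)(_-_.*\.m4a)$/$1\L$2\E$3/' *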
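
To check which part of a name such a pattern will lowercase before renaming anything, the same three-group split can be tried out separately. This is a rough sketch in Python, which has no \L or \E, so a replacement function does the lowercasing; lowercase_title is a hypothetical helper that only computes the new name and does not touch any files.

```python
import re

# Three groups: prefix up to the title, the title itself, and the trailing hash + extension.
NAME_RE = re.compile(r"^(.*_-_)(.*)(_-_.*\.m4a)$")

def lowercase_title(filename: str) -> str:
    """Return the filename with only the middle group lowercased."""
    # Equivalent in spirit to $1\L$2\E$3: groups 1 and 3 are copied unchanged.
    return NAME_RE.sub(
        lambda m: m.group(1) + m.group(2).lower() + m.group(3),
        filename,
    )

print(lowercase_title("artist_-_album_-_Song_Title_to_be_Lowercased_-_MultiCaseHash.m4a"))
# artist_-_album_-_song_title_to_be_lowercased_-_MultiCaseHash.m4a
```

Names that do not match the pattern are returned unchanged.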

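In Perl, there's also tr, which lowercases the whole string rather than a captured group and only covers the ASCII range listed. Note that tr takes plain character lists, not regex character classes, so no square brackets are needed:

$string =~ tr/A-Z/a-z/;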

Many regex implementations (JavaScript, Python, and PHP, among others) let you pass a callback function as the replacement, so you can simply return a lowercased version of the match from the callback.
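
As a minimal sketch of the callback approach, here is how it might look with Python's re.sub, which accepts a function in place of a replacement string. The pattern and sample text are made up for illustration.

```python
import re

def lowercase_match(match: re.Match) -> str:
    """Callback: return the lowercased text of the whole match."""
    return match.group(0).lower()

text = "Hello WORLD, Hello REGEX"

# Lowercase every run of two or more capital letters.
print(re.sub(r"[A-Z]{2,}", lowercase_match, text))
# Hello world, Hello regex

# count=1 replaces only the first match, like omitting /g in Perl.
print(re.sub(r"[A-Z]{2,}", lowercase_match, text, count=1))
# Hello world, Hello REGEX
```

To lowercase only one captured group, return match.group(n).lower() combined with the other groups unchanged.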