I'm looking for a Perl regex that will capitalize any character which is preceded by whitespace (or the first char in the string).
I'm pretty sure there is a simple way to do this, but I don't have my Perl book handy and I don't do this often enough that I've memorized it...
s/(\s\w)/\U$1\E/g;
I originally suggested:
s/\s\w/\U$&\E/g;
but alarm bells were going off at the use of '$&' (even before I read @Manni's comment). It turns out that they're fully justified: using the $&, $` and $' variables anywhere in a program slows down every regex match in it, at least in older versions of Perl.
The \E is not critical for this regex; it turns off the 'case-setting' switch (\U in this case, or \L for lower-case).
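To make the role of \E concrete, here is a small sketch of how \U, \L and \E behave inside a double-quoted string (the same rules apply in the replacement part of s///). The variable names are only for illustration.

```perl
use strict;
use warnings;

my $word = "perl";

# \U upper-cases everything up to the matching \E.
my $upper_then_plain = "\U$word\E rocks";

# Without \E, \U stays in effect to the end of the string.
my $upper_all = "\U$word rocks";

# \L is the lower-case counterpart of \U.
my $lower_then_plain = "\LLOUD\E Voice";

print "$upper_then_plain\n";   # PERL rocks
print "$upper_all\n";          # PERL ROCKS
print "$lower_then_plain\n";   # loud Voice
```

In the substitution above nothing follows $1 in the replacement, which is why leaving out \E makes no difference there.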
As noted in the comments, matching the first character of the string requires:
s/((?:^|\s)\w)/\U$1\E/g;
Corrected position of second close parenthesis - thanks, Blixtor.
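For reference, a minimal runnable sketch of that last substitution. The sub name capitalize_words is made up for this example and is not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution from the answer.
sub capitalize_words {
    my ($text) = @_;
    # (?:^|\s) matches the start of the string or one whitespace character;
    # \w is the character to capitalize. The whole match is captured in $1
    # and upper-cased by \U...\E (upper-casing whitespace changes nothing).
    $text =~ s/((?:^|\s)\w)/\U$1\E/g;
    return $text;
}

print capitalize_words("the quick brown fox"), "\n";   # The Quick Brown Fox
```

Note that without the /m flag, ^ only matches at the very start of the string, so text after an embedded newline is picked up by the \s branch instead.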
Depending on your exact problem, this could be more complicated than you think and a simple regex might not work. Have you thought about capitalization inside the word? What if the word starts with punctuation like '...Word'? Are there any exceptions? What about international characters?
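A quick sketch of two of those cases, using the simple start-or-whitespace regex (the sub name simple_capitalize is hypothetical): a word with leading punctuation is left alone, and letters after an apostrophe stay lower-case.

```perl
use strict;
use warnings;

# Hypothetical helper: the simple regex approach, for comparison only.
sub simple_capitalize {
    my ($text) = @_;
    $text =~ s/((?:^|\s)\w)/\U$1\E/g;
    return $text;
}

# '...word' starts with punctuation, so the leading 'w' is never matched;
# the 'n' in "o'neil" follows an apostrophe, not whitespace.
print simple_capitalize("...word and o'neil"), "\n";   # ...word And O'neil
```

Whether that output is acceptable depends on your data, which is the point of the questions above.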
It might be better to use a CPAN module like Text::Autoformat or Text::Capitalize where these problems have already been solved.
use Text::Capitalize 0.2;
print capitalize_title($t), "\n";
use Text::Autoformat;
print autoformat{case => "highlight", right=>length($t)}, $t;
It sounds like Text::Autoformat might be more "standard", and I would try that first. It's written by Damian Conway. But Text::Capitalize does a few things that Text::Autoformat doesn't. Here is a comparison.
You can also check out the Perl Cookbook, recipe 1.14 (page 31), on how to use regexps to properly capitalize a title or headline.