I know that I can just use [a-zA-Z0-9]
or the [[:alnum:]]
POSIX character class. But I would like to parse a lot of LaTeX macros, whose names do not allow '_'
(or digits), and this becomes tedious very quickly, especially because I want to use the \b
assertion a lot. The question title mentions only the underscore, but this is really a more general question.
For example:
my $FOUNDNUM=(s/\\$known\b/\\$xltd{$known}/g);
Is it possible to change the set of characters in the word class once and for all?
I think the answer is no (I could not find a pragma or special variable), but I wanted to double check.
EDIT: Clarification:
my $b=qr/(?<![^a-zA-Z])/;
my $v= "Hi 1 Hi aHi Hia Hi123 Hi_3 _Hi_";
print " In:\t'$v'\n";
print "Desired:\t'** 1 ** aHi Hia **123 **_3 _**_'\n\n";
$_ = $v; print "".(s/([^a-zA-Z])Hi([^a-zA-Z])/$1**$2/g)." times to:\t'$_'\n";
$_ = $v; print "".(s/\bHi\b/**/g)." times to:\t'$_'\n";
$_ = $v; print "".(s/${b}Hi${b}/**/g)." times to:\t'$_'\n";
yields
In: 'Hi 1 Hi aHi Hia Hi123 Hi_3 _Hi_'
Desired: '** 1 ** aHi Hia **123 **_3 _**_'
4 times to: 'Hi 1 ** aHi Hia **123 **_3 _**_'
2 times to: '** 1 ** aHi Hia Hi123 Hi_3 _Hi_'
2 times to: '** 1 Hi a** Hia Hi123 Hi_3 _Hi_'
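The first pattern almost works (in this example it misses only the Hi at the start of the string), but it forces me to capture and re-emit the neighbouring characters with $1 and $2 and to spell out the character set each time.
The second pattern would have worked, except that \b treats the underscore and digits as word characters. Nicely, it does work at the start of the string.
The third pattern was an attempt to store a regex in a variable as an abbreviation, but it fails: (?<![^a-zA-Z]) requires the preceding character to be a letter (or the start of the string), which is the opposite of a boundary, and the trailing copy looks back at the 'i' of Hi instead of looking ahead.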
The best solution comes from CasimiretHippolyte (thank you!). While it is not possible to redefine '\b', we can define regexes up front for two zero-width assertions, one for the start of the word and one for the end. Here [^\W_\d] matches a word character that is neither an underscore nor a digit, i.e. a letter.
my $b1=qr/(?<![^\W_\d])/;
my $b2=qr/(?![^\W_\d])/;
my $v= "Hi 1 Hi aHi Hia Hi123 Hi_3 _Hi_ 3Hi";
print " In:\t'$v'\n";
print "Desired:\t'** 1 ** aHi Hia **123 **_3 _**_ 3**'\n\n";
$_ = $v; print "".(s/${b1}Hi${b2}/**/g)." times to:\t'$_'\n";
$_ = $v; print "".(s/(?<![^\W_\d])Hi(?![^\W_\d])/**/g)." times to:\t'$_'\n";
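Applied to the LaTeX case from the question, the same idea might look roughly like the sketch below. The helper name translate_macros and the contents of %xltd are made up for illustration, and the sketch assumes macro names consist of letters only. Only the trailing assertion is needed here, because the leading backslash already marks where a macro name starts.

```perl
use strict;
use warnings;

# Trailing letter-only boundary: [^\W_\d] is a word character that is
# neither '_' nor a digit, so this asserts "not followed by a letter".
my $end = qr/(?![^\W_\d])/;

# Hypothetical translation table (old macro name => new macro name).
my %xltd = (alpha => 'Alpha', vec => 'mathbf');

# Alternation of all known names, quoted so they are matched literally.
my $names = join '|', map { quotemeta } sort keys %xltd;

# Translate every known \macro in one pass.
# Returns the new text and the number of replacements made.
sub translate_macros {
    my ($text) = @_;
    # s///g returns the number of substitutions, or a false value if none
    my $count = ($text =~ s/\\($names)$end/\\$xltd{$1}/g) || 0;
    return ($text, $count);
}

my ($out, $n) = translate_macros('\alpha_1 + \alphabet + \vec2 + \vec{x}');
print "$n replacements: $out\n";
# prints: 3 replacements: \Alpha_1 + \alphabet + \mathbf2 + \mathbf{x}
```

Doing all names in a single substitution avoids re-translating a macro that an earlier replacement has just produced, which a loop over one name at a time could do.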