I'm trying to extract data that matches a specified range of 8-digit hexadecimal addresses.
For example, to catch all strings between A0000000 and A0003CFF, a pattern like A000[0000-3CFF] does not work, because a character class matches a single character rather than a numeric range. Instead I have to write the following pattern for the last four digits:
[0-2][0-9A-F][0-9A-F][0-9A-F]|3[0-9AB][0-9A-F][0-9A-F]|3C[0-9A-F][0-9A-F]
which matches all frames between 0000 and 2FFF, or 3000 and 3BFF, or 3C00 and 3CFF.
This pattern is very specific to one range and to one identical part shared by both limits (in my example, the 4 common digits A000).
But how can I turn this into a more generic solution if neither the address range nor the common part is known in advance, and both can be freely selected?
Is there a simple way to do this with a regex?
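For reference, the hand-written alternation for this one range can be sketched as a complete Perl regex. This is only an illustration: the variable name $in_range and the \b anchors are assumptions, not part of the original pattern. It spells the hex digit class out as [0-9A-F], because in ASCII a range such as [0-F] would also admit the characters : ; < = > ? @.

```perl
use strict;
use warnings;

# Hypothetical name: $in_range matches A0000000..A0003CFF only.
my $in_range = qr/
    \b A000
    (?: [0-2][0-9A-F]{3}      # 0000-2FFF
      | 3[0-9AB][0-9A-F]{2}   # 3000-3BFF
      | 3C[0-9A-F]{2}         # 3C00-3CFF
    ) \b
/x;

for my $addr (qw(A0002FFF A0003CFF A0003D00)) {
    printf "%s %s\n", $addr, $addr =~ $in_range ? 'matches' : 'does not match';
}
```

Every new pair of limits needs a new alternation of this kind, which is why the pattern does not generalize well.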
Thanks for the help
[Edit]: Thank you all for your answers, they are definitely helpful. As I suspected, there is no simple way to do this with a regular expression alone, and a numeric comparison is required, or at least easier. Since I had trouble explaining my original idea, I came up with the hex example because it mixes digits (\d) and letters (\w), and I told myself that a regex solution could be extended to the letters [F-Z]. I did not expect that you would use the hex function for the conversion, since I was looking for a regex. I am sorry for my poor explanation.
But your solution is still very useful: I got something working by replacing the hex conversion with a custom function that converts all characters [0-9] and [A-Z] to their ASCII values with some weighting coefficient (e.g. GV03 must be greater than G0V3). Thanks again, everyone, for your help!
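One possible shape for such a custom conversion is sketched below. It is an assumption, not the poster's actual function: instead of weighting raw ASCII values, it treats the string as a base-36 number (0-9 map to 0-9, A-Z map to 10-35), so each position is weighted by a power of 36. The name base36_value is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical replacement for hex(): interprets a string over [0-9A-Z]
# as a base-36 number. Intended for short identifiers only, so the
# result stays within the range Perl can represent exactly.
sub base36_value {
    my ($str) = @_;
    die "unexpected character in '$str'\n" unless $str =~ /\A[0-9A-Za-z]+\z/;
    my $value = 0;
    for my $ch (split //, uc $str) {
        # Digits keep their face value; letters continue from 10 upwards.
        my $digit = $ch =~ /[0-9]/ ? $ch : ord($ch) - ord('A') + 10;
        $value = $value * 36 + $digit;
    }
    return $value;
}

print "GV03 is greater than G0V3\n"
    if base36_value('GV03') > base36_value('G0V3');
```

With a function like this, the range check stays the same numeric comparison as with hex, only with a different conversion step.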
Check whether the following code meets your requirements:
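use strict;
use warnings;
use feature 'say';
my @range = qw(A0000000 A0003CFF);
# Capture the common prefix A000 followed by four hex digits
my $match = qr/(A000[0-9a-f]{4})/i;
# Convert both limits to numbers once, before the loop
@range = map { hex } @range;
while( <DATA> ) {
    chomp;
    if( /$match/ ) {
        my $n = hex($1);
        # Print the line only if the captured address lies within the limits
        say if $n >= $range[0] && $n <= $range[1];
    }
}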
__DATA__
This number is A0000000 must be printed
But B0001234 should not be printed
Again A0001CCF must be printed
Once more A0002FFF must be printed
But C0010000 should not be printed
Output
This number is A0000000 must be printed
Again A0001CCF must be printed
Once more A0002FFF must be printed
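If the common prefix is not known in advance either, the same idea can be sketched without hard-coding A000: capture every 8-digit hex token and let the numeric comparison do the filtering. The helper name hex_tokens_in_range and its argument order are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every 8-digit hex token in $text whose
# value lies within [$lo, $hi]; both limits are given as hex strings.
sub hex_tokens_in_range {
    my ($text, $lo, $hi) = @_;
    my ($min, $max) = (hex($lo), hex($hi));
    # In list context, a /g match returns all captured tokens.
    my @tokens = $text =~ /\b([0-9A-Fa-f]{8})\b/g;
    return grep { my $n = hex($_); $n >= $min && $n <= $max } @tokens;
}

my $line = 'A0000000 B0001234 A0003CFF A0003D00';
print join(',', hex_tokens_in_range($line, 'A0000000', 'A0003CFF')), "\n";
# prints "A0000000,A0003CFF"
```

Here the regex only finds candidates, so changing the range means changing the two limit strings and nothing else.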
Convert the strings that represent hexadecimal numbers to numeric values using hex, then compare them as you would compare any other numbers. For example:
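use strict;
use warnings;
my $min = 'A0000000';
my $max = 'A0003CFF';
for ( qw( A0000001 A0003DFF ) ) {
    # hex() turns each string into a number, so >= and <= compare values
    print hex($_) >= hex($min) && hex($_) <= hex($max)
        ? "$_ is in range\n"
        : "$_ is out of range\n";
}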
Output:
A0000001 is in range
A0003DFF is out of range