I have sentences including letters, integers, and decimals.
Example:
There are 1.6mm, 2.1mmcycst.
There are many about 3mm cysts.
There are 2 cysts about 4~5mm.
("2.1mm cyst" or "2.1 mm cysts" would be the correct wording, but our data contains "2.1mmcycst")
From these sentences, I want to extract the numbers. For example:
1.6 and 2.1
3
4~5
I'm not familiar with regular expressions, and I cannot extract just the numbers, including decimals and related signs (e.g., "~").
Here is the code:
#!/usr/bin/perl
my $qwe = "There are 1.6mm, 2.1mmcycst.";
print "$qwe\n";
if($qwe =~ /\d+(\.\d)?\d*/){
print "$&\n";
}
From the script, I got the output below:
1.6
I expected 1.6 and 2.1.
How can I change my regex to match the pattern multiple times in a single line?
I use macOS 10.14.5 and perl v5.18.4.
Do not reinvent the wheel. If a task seems common to you, it is likely that there is a Perl module for that. Regexp::Common can be used for matching common regular expressions, including numbers of various kinds. For example, your sample input can be extended with more complex examples of numbers, all of which can be parsed as shown below:
Create the input:
cat > in.txt <<EOF
There are 1.6mm, 2.1mmcycst.
There are many about 3mm cysts.
There are 2 cysts about 4~5mm.
The collection has 1.23E6 frozen cysts, stored at -70.5C, with cysts ranging in size from 1e-3m to 5.12E-3
EOF
Parse and print the real numbers:
perl -MRegexp::Common -lne 'print join " ", /($RE{num}{real})/g;' in.txt
Output:
1.6 2.1
3
2 4 5
1.23E6 -70.5 1e-3 5.12E-3
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-MRegexp::Common : Same as putting use Regexp::Common; at the top of the program.
/($RE{num}{real})/g : Capture all real numbers in the input line $_. The parentheses capture the matched text, and /.../g repeats the match as many times as possible. In list context, imposed here by join, the match returns the list of all captured substrings, which are then printed.
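For readers who prefer a script to a one-liner, the flags described above correspond roughly to the loop below. This is only a sketch under stated assumptions: the sub name numbers_in and the pattern $NUM are hypothetical, and $NUM is a simplified hand-written stand-in used so that the example runs without Regexp::Common. It is not equivalent to $RE{num}{real}. The input is read from an in-memory string instead of in.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for $RE{num}{real}: optional sign,
# digits, optional fraction, optional exponent. Not the module's pattern.
my $NUM = qr/[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?/;

# Return every number found in one line. In list context, a /g match
# with one capture group returns the list of all captured substrings.
sub numbers_in {
    my ($line) = @_;
    return $line =~ /($NUM)/g;
}

# Sample input held in memory (an assumption made to keep this self-contained).
my $input = "There are 1.6mm, 2.1mmcycst.\n"
          . "There are 2 cysts about 4~5mm.\n";
open my $fh, '<', \$input or die "Cannot open in-memory input: $!";

# Roughly what -n and -l do: read each line, strip the newline,
# run the code, and print with a trailing newline.
while (my $line = <$fh>) {
    chomp $line;
    print join(" ", numbers_in($line)), "\n";
}
close $fh;
```

Calling numbers_in in list context matters here: in scalar context the same match would only report whether a number was found.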
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
Note: you need to install the Regexp::Common Perl module. It is not part of the standard Perl library.
Use the /g option on your match operator to get all matches. Then replace if with while to iterate over them.
#!/usr/bin/perl
my $qwe = "There are 1.6mm, 2.1mmcycst.";
print "$qwe\n";
while ($qwe =~ /\d+(\.\d)?\d*/g) {
print "$&\n";
}
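The question also asks for ranges such as "4~5" to be kept together. A possible variation on the loop above is sketched here; the sub name extract_sizes and the variable $num are hypothetical, and the pattern assumes a range is always written as two numbers joined by "~" with no spaces. It uses a capture group and $1 instead of $&.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One non-negative decimal number, e.g. "3" or "1.6".
my $num = qr/\d+(?:\.\d+)?/;

# Collect numbers from a string, keeping "4~5" style ranges together.
sub extract_sizes {
    my ($text) = @_;
    my @found;
    # With /g in scalar context (the while condition), each iteration
    # resumes after the previous match; $1 holds the captured text.
    while ($text =~ /($num(?:~$num)?)/g) {
        push @found, $1;
    }
    return @found;
}

for my $sentence (
    "There are 1.6mm, 2.1mmcycst.",
    "There are many about 3mm cysts.",
    "There are 2 cysts about 4~5mm.",
) {
    print join(" and ", extract_sizes($sentence)), "\n";
}
```

For the three sample sentences this prints "1.6 and 2.1", "3", and "2 and 4~5". The leading 2 in the last line is the cyst count; excluding it would need an extra rule, such as requiring "mm" after the number.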