Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

regex to match irreducible fractions

Tags:

How can I match irreducible fractions with regex?

For example, 23/25, 3/4, 5/2, 100/101, etc.

First of all, I have no idea about the gcd-algorithm realization in regex.

Update for all of you who's answering like "You are using the wrong tool":

Yeah, guys, I'm realizing what regex is normally used for. It's okay. But that this question is weird is kind of its whole point.

Updated 2: The idea is to find a regex that could be helpful in a situation like:

$> echo "1/2" | grep -P regex 1/2 $> echo "2/4" | grep -P regex 

So, the regex should be only a string, without using any scripts and variables. Only regex.

Actually, I already know some regex which match reducible fractions written in the unary number system.

$> echo "11/1111" | grep -P '^1/1+$|(11+)+\1+/\1+$' 11/1111 

So the thing is to convert from decimal to unary number system in regex, but I don't know how.

like image 680
ДМИТРИЙ МАЛИКОВ Avatar asked Apr 15 '11 14:04

ДМИТРИЙ МАЛИКОВ


2 Answers

UPDATE

Since the poster requested a single regex that matches against strings like "36/270", but says it doesn’t matter how legible it is, that regex is:

my $reducible_rx = qr{^(\d+)/(\d+)$(?(?{(1x$1."/".1x$2)=~m{^(?|1+/(1)|(11+)\1*/\1+)$}})|^)}; 

But, if like me, you believe that an illegible regex is absolutely unacceptable, you will write that more legibly as:

my $reducible_rx = qr{   # first match a fraction:     ^ ( \d+ ) / ( \d+ ) $   # now for the hard part:     (?(?{ ( 1 x $1 . "/" . 1 x $2 ) =~ m{                 ^                 (?|    1+      / (1)  # trivial case: GCD=1                   |  (11+) \1* / \1+  # find the GCD                 )                  $             }x         })           # more portable version of (*PASS)      | ^  # more portable version of (*FAIL)      ) }x; 

You can improve maintainability by splitting out the version that matches the unary version from the one that matches the decimal version like this:

# this one assumes unary notation my $unary_rx = qr{     ^      (?|   1+       / (1)       | (11+)  \1* / \1+      )      $ }x;  # this one assumes decimal notation and converts internally my $decimal_rx = qr{   # first match a fraction:     ^ ( \d+ ) / ( \d+ ) $    # now for the hard part:     (?(?{( 1 x $1 . "/" . 1 x $2 ) =~ $unary_rx})           # more portable version of (*PASS)      | ^  # more portable version of (*FAIL)       ) }x; 

Isn’t that much easier by separating it into two named regexes? That would now make $reducible_rx the same as $decimal_rx, but the unary version is its own thing. That’s how I would do it, but the original poster wanted a single regex, so you’d have to interpolate the nested one for that as I first present above.

Either way, you can plug into the test harness below using:

    if ($frac =~ $reducible_rx) {         cmp_ok($frac, "ne", reduce($i, $j), "$i/$j is $test");     } else {         cmp_ok($frac, "eq", reduce($i, $j), "$i/$j is $test");     } 

And you will see that it is a correct regex that passes all tests, and does so moreover using a single regex, wherefore having now passed all requirements of the original question, I declare Qᴜᴏᴅ ᴇʀᴀᴛ ᴅᴇᴍᴏɴsᴛʀᴀɴᴅᴜᴍ: “Quit, enough done.” 😇

And you’re welcome.


The answer is to match the regex ^(?|1+/(1)|(11+)\1*/\1+)$ against the fraction once it has been converted from decimal to unary notation, at which point the greatest common factor will be found in $1 on a match; otherwise they are coprimes. If you are using Perl 5.14 or better, you can even do this in one step:

use 5.014; my $reg  = qr{^(?|1+/(1)|(11+)\1*/\1+)$}; my $frac = "36/270";  # for example if ($frac =~ s/(\d+)/1 x $1/reg =~ /$reg/) {      say "$frac can be reduced by ", length $1; } else {     say "$frac is irreducible"; } 

Which will correctly report that:

36/270 can be reduced by 18 

(And of course, reducing by 1 means there is no longer a denominator.)

If you wanted to have a bit of punning fun with your readers, you could even do it this way:

use 5.014; my $regex = qr{^(?|1+/(1)|(11+)\1*/\1+)$}; my $frac  = "36/270";  # for example if ($frac =~ s/(\d+)/"1 x $1"/regex =~ /$regex/) {     say "$frac can be reduced by ", length $1; } else {     say "$frac is irreducible"; } 

Here is the code that demonstrates how to do this. Furthermore, it constructs a test suite that tests its algorithm using all (positive) numerators and denominators up to its argument, or 30 by default. To run it under a test harness, put it in a file named coprimes and do this:

$ perl -MTest::Harness -e 'runtests("coprimes")' coprimes .. ok        All tests successful. Files=1, Tests=900,  1 wallclock secs ( 0.13 usr  0.02 sys +  0.33 cusr  0.02 csys =  0.50 CPU) Result: PASS 

Here is an example of its output when run without the test harness:

$ perl coprimes 10 1..100 ok 1 - 1/1 is 1 ok 2 - 1/2 is 1/2 ok 3 - 1/3 is 1/3 ok 4 - 1/4 is 1/4 ok 5 - 1/5 is 1/5 ok 6 - 1/6 is 1/6 ok 7 - 1/7 is 1/7 ok 8 - 1/8 is 1/8 ok 9 - 1/9 is 1/9 ok 10 - 1/10 is 1/10 ok 11 - 2/1 is 2 ok 12 - 2/2 is 1 ok 13 - 2/3 is 2/3 ok 14 - 2/4 is 1/2 ok 15 - 2/5 is 2/5 ok 16 - 2/6 is 1/3 ok 17 - 2/7 is 2/7 ok 18 - 2/8 is 1/4 ok 19 - 2/9 is 2/9 ok 20 - 2/10 is 1/5 ok 21 - 3/1 is 3 ok 22 - 3/2 is 3/2 ok 23 - 3/3 is 1 ok 24 - 3/4 is 3/4 ok 25 - 3/5 is 3/5 ok 26 - 3/6 is 1/2 ok 27 - 3/7 is 3/7 ok 28 - 3/8 is 3/8 ok 29 - 3/9 is 1/3 ok 30 - 3/10 is 3/10 ok 31 - 4/1 is 4 ok 32 - 4/2 is 2 ok 33 - 4/3 is 4/3 ok 34 - 4/4 is 1 ok 35 - 4/5 is 4/5 ok 36 - 4/6 is 2/3 ok 37 - 4/7 is 4/7 ok 38 - 4/8 is 1/2 ok 39 - 4/9 is 4/9 ok 40 - 4/10 is 2/5 ok 41 - 5/1 is 5 ok 42 - 5/2 is 5/2 ok 43 - 5/3 is 5/3 ok 44 - 5/4 is 5/4 ok 45 - 5/5 is 1 ok 46 - 5/6 is 5/6 ok 47 - 5/7 is 5/7 ok 48 - 5/8 is 5/8 ok 49 - 5/9 is 5/9 ok 50 - 5/10 is 1/2 ok 51 - 6/1 is 6 ok 52 - 6/2 is 3 ok 53 - 6/3 is 2 ok 54 - 6/4 is 3/2 ok 55 - 6/5 is 6/5 ok 56 - 6/6 is 1 ok 57 - 6/7 is 6/7 ok 58 - 6/8 is 3/4 ok 59 - 6/9 is 2/3 ok 60 - 6/10 is 3/5 ok 61 - 7/1 is 7 ok 62 - 7/2 is 7/2 ok 63 - 7/3 is 7/3 ok 64 - 7/4 is 7/4 ok 65 - 7/5 is 7/5 ok 66 - 7/6 is 7/6 ok 67 - 7/7 is 1 ok 68 - 7/8 is 7/8 ok 69 - 7/9 is 7/9 ok 70 - 7/10 is 7/10 ok 71 - 8/1 is 8 ok 72 - 8/2 is 4 ok 73 - 8/3 is 8/3 ok 74 - 8/4 is 2 ok 75 - 8/5 is 8/5 ok 76 - 8/6 is 4/3 ok 77 - 8/7 is 8/7 ok 78 - 8/8 is 1 ok 79 - 8/9 is 8/9 ok 80 - 8/10 is 4/5 ok 81 - 9/1 is 9 ok 82 - 9/2 is 9/2 ok 83 - 9/3 is 3 ok 84 - 9/4 is 9/4 ok 85 - 9/5 is 9/5 ok 86 - 9/6 is 3/2 ok 87 - 9/7 is 9/7 ok 88 - 9/8 is 9/8 ok 89 - 9/9 is 1 ok 90 - 9/10 is 9/10 ok 91 - 10/1 is 10 ok 92 - 10/2 is 5 ok 93 - 10/3 is 10/3 ok 94 - 10/4 is 5/2 ok 95 - 10/5 is 2 ok 96 - 10/6 is 5/3 ok 97 - 10/7 is 10/7 ok 98 - 10/8 is 5/4 ok 99 - 10/9 is 10/9 ok 100 - 10/10 is 1 

And here is the program:

#!/usr/bin/env perl # # coprimes - test suite to use unary coprimality algorithm #  # Tom Christiansen <[email protected]> # Sun Apr 17 12:18:19 MDT 2011  use strict; use warnings;  my $DEFAULT = 2*3*5; my $max = @ARGV ? shift : $DEFAULT;  use Test::More; plan tests => $max ** 2;  my $rx = qr{     ^     (?|   1+       / (1)       | (11+)  \1* / \1+     )     $ }x;  for my $i ( 1 .. $max ) {     for my $j ( 1 .. $max ) {         my $test;         if (((1 x $i) . "/" . (1 x $j)) =~ /$rx/) {             my $cf = length($1);             $test = $i / $cf;             $test .= "/" . $j/$cf unless $j/$cf == 1;         } else {             $test = "$i/$j";         }         cmp_ok($test, "eq", reduce($i, $j), "$i/$j is $test");     } }  sub reduce {     my ($a, $b) = @_;     use Math::BigRat;     my $f = new Math::BigRat "$a/$b";     return "$f"; } 
like image 71
tchrist Avatar answered Sep 19 '22 07:09

tchrist


Nope it cannot be done. Like a good computer scientist I will ignore the specifics of the tool regex and assume you are asking if there is a regular expression. I do not have enough knowledge about regex's features to ensure it is restricted to regular expressions. That caveat aside, on with the show.

Rewording this we get:

Let L be the language {"a/b"| where a and b are natural numbers encoded in a radix r and a and b are coprime}. Is L regular?

Assume such a language is regular. Then there exists a DFA that can decide membership in L. Let N be the number of states of such a DFA. There are an infinite number of primes. As the number of primes is infinite, there are arbitrarily many primes greater than the largest number encodable in N digits in the radix r. (Note: The largest number is clearly r raised to the power of N. I am using this weird wording to show how to accommodate unary.) Select N+1 primes that are greater than this number. All of these numbers are encoded using at least N+1 digits (in the radix r). Enumerate these primes p₀ to pₙ. Let sᵢ be the state of the pᵢ is in immediately after reading the /. By the pigeon hole principle, there are N states and N+1 sᵢ states so there exists at least one pair of indexes (j,k) such that sⱼ = sₖ. So starting from the initial state of the DFA, inputs pₖ/ and pⱼ/ lead to the same state sⱼ (or sₖ) and pⱼ and pₖ are distinct primes.

L must accept all pairs of distinct primes p/q as they are coprime and reject all primes divided by themselves p/p as p is not coprime to p. Now the language accepts pⱼ = pₖ so there is a sequence of states from sⱼ using the string pₖ to an accepting state, call this sequence β. Let α be the sequence of states reading pₖ starting from the initial state. The sequence of states for the DFA starting at the initial state for the string pₖ/pₖ must be the same as α followed by β. This sequence starts in an initial state, goes to sₖ (by reading the input pₖ), and reaches an accepting state by reading pₖ. The DFA accepts pₖ/pₖ and pₖ/pₖ is in L. pₖ is not coprime to pₖ, and therefore pₖ/pₖ is not in L. Contradiction. Therefore the language L is irregular, or no regular expression exists.

like image 36
Tim Avatar answered Sep 20 '22 07:09

Tim