
Randomizing text between delimiters

I have this simple input

I have {red;green;orange} fruit and cup of {tea;coffee;juice}

I use Perl to find the text between the outer brace delimiters { and } and pick one of the fields inside at random. The fields are separated by the inner delimiter ;.

I'm getting this output

I have green fruit and cup of coffee

This is my working Perl script

perl -plE 's!\{(.*?)\}!@x=split/;/,$1;$x[rand@x]!ge' <<< 'I have {red;green;orange} fruit and cup of {tea;coffee;juice}'

My task is to process this input format

I have { {red;green;orange} fruit ; cup of {tea;coffee;juice} } and {nice;fresh} {sandwich;burger}.

As I understand it, the script should not stop at the first closing brace in the first part of the text, because the outer { ... } pair there contains further brace groups:

{ {red;green;orange} fruit ; cup of {tea;coffee;juice} }

It should choose a random part, like this

{red;green;orange} fruit

or

cup of {tea;coffee;juice}

Then it goes deeper:

green fruit

After all text is processed, the result may be any of the following

I have red fruit and fresh burger.
I have cup of tea and nice sandwich.
I have green fruit and nice burger.
I have cup of coffee and fresh burger.

The script should also be able to parse and randomize longer text. For example

This {beautiful;perfect} {image;photography}, captured with the { {NASA;ESA} Hubble Telescope ; {NASA;ESA} Hubble Space Telescope} }, is the {largest;sharpest} image ever taken of the Andromeda galaxy { {— otherwise known as M31;— known as M31}; [empty here] }.
This is a cropped version of the full image and has 1.5 billion pixels. { You would need more than {600;700;800} HD television screens to display the whole image. ; If you want to display the whole image, you need to download more than {1;2} Tb. traffic and use 800 HD displays }

An example output could be

This beautiful image, captured with the NASA Hubble Telescope, is the
sharpest image ever taken of the Andromeda galaxy — otherwise known as
M31.
This is a cropped version of the full image and has 1.5 billion
pixels. You would need more than 700 HD television screens to display
the whole image.

kempinski asked Dec 24 '15 13:12


2 Answers

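Going non-greedy is a good thought, but it doesn't quite do the trick, because the match still begins at the outer opening brace. Instead, match only groups that contain no braces themselves, and add a loop so the outer groups get resolved once the inner ones are gone: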

perl -plE 'while(s!\{([^{}]*)\}!@x=split/;/,$1;$x[rand@x]!ge){}'
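To illustrate the difference between the two patterns, here is a minimal sketch that applies each one once to a shortened version of the nested input. The variable names are made up for this example.

```perl
use strict;
use warnings;

my $input = 'I have { {red;green} fruit ; cup of {tea;coffee} }';

# Non-greedy: the match starts at the outer '{' and stops at the first '}',
# so the captured text still contains an opening brace.
my ($lazy) = $input =~ /\{(.*?)\}/;
print "lazy:      [$lazy]\n";      # lazy:      [ {red;green]

# Brace-free character class: only an innermost group can match.
my ($inner) = $input =~ /\{([^{}]*)\}/;
print "innermost: [$inner]\n";     # innermost: [red;green]
```

Because the second pattern can only ever match an innermost group, repeating the substitution peels the nesting away one level at a time.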

Note that your second sample input has unmatched braces, so this one-liner leaves a spurious '}' in its output.
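Since unmatched braces end up as stray characters in the result, it may help to check the input before expanding it. A minimal sketch, assuming a hypothetical helper named `braces_balanced` (not part of the one-liner above) that scans the text once and tracks the nesting depth:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if every '{' has a matching '}', else 0.
sub braces_balanced {
    my ($text) = @_;
    my $depth = 0;
    for my $ch ($text =~ /[{}]/g) {
        $depth += $ch eq '{' ? 1 : -1;
        return 0 if $depth < 0;    # a '}' appeared before its '{'
    }
    return $depth == 0 ? 1 : 0;
}

print braces_balanced('I have {red;green} fruit') ? "balanced\n" : "unbalanced\n";
print braces_balanced('{ {NASA;ESA} Hubble} }')   ? "balanced\n" : "unbalanced\n";
```

The first call prints `balanced`, the second `unbalanced`, because the final `}` has no opening partner.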

William Pursell answered Sep 20 '22 00:09


Nice challenge. What you need to do is find a set of braces with no braces inside it and pick a random item from it. Doing that globally replaces only the innermost ("level 1") groups, so you need to loop over the string until no more matches are found.
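To watch the innermost-first loop at work, here is a small sketch that prints the string after every pass. To keep the trace reproducible it always takes the first alternative instead of a random one. The helper name `expand_first` and the shortened input are made up for this illustration.

```perl
use strict;
use warnings;

# Illustration only: same looping idea, but deterministic (always the first
# alternative) and printing the intermediate string after each pass.
sub expand_first {
    my ($text) = @_;
    my $pass = 0;
    while ($text =~ s!\{([^{}]+)\}!(split /;/, $1)[0]!ge) {
        $pass++;
        print "pass $pass: $text\n";
    }
    return $text;
}

expand_first('I have { {red;green} fruit ; cup of {tea;coffee} } and {nice;fresh} pie.');
# pass 1: I have { red fruit ; cup of tea } and nice pie.
# pass 2: I have  red fruit  and nice pie.
```

The first pass resolves every group that has no braces inside it. That turns the outer group into an innermost one, which the second pass then resolves. The third attempt finds nothing to replace, so the loop ends.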

use v5.18;
use strict;
use warnings;

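sub rand_sentence {
    my $copy = shift;
    # Repeatedly replace every innermost group (braces with no braces inside)
    # with one randomly chosen ';'-separated alternative, until none remain.
    1 while $copy =~ s{ \{ ([^{}]+) \} }
                      { my @words = split /;/, $1; $words[rand @words] }xsge;
    return $copy;
}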

my $str = 'I have { {red;green;orange} fruit ; cup of {tea;coffee;juice} } and {nice;fresh} {sandwich;burger}.';
say rand_sentence($str);
say '';

$str = <<'END';
This {beautiful;perfect} {image;photography}, captured with the { {NASA;ESA}
Hubble Telescope ; {NASA;ESA} Hubble Space Telescope }, is the
{largest;sharpest} image ever taken of the Andromeda galaxy { {— otherwise
known as M31;— known as M31}; [empty here] }. This is a cropped version of the
full image and has 1.5 billion pixels. { You would need more than {600;700;800}
HD television screens to display the whole image. ; If you want to display the
whole image, you need to download more than {1;2} Tb.  traffic and use 800 HD
displays }
END

say rand_sentence($str);

Sample output:

I have  orange fruit  and fresh sandwich.

This beautiful photography, captured with the  ESA Hubble Space Telescope , is the
largest image ever taken of the Andromeda galaxy  — otherwise
known as M31. This is a cropped version of the
full image and has 1.5 billion pixels.  If you want to display the
whole image, you need to download more than 1 Tb.  traffic and use 800 HD
displays
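
The doubled spaces in this output come from the whitespace around ';' and just inside the braces, which the split keeps as part of each alternative. One possible refinement, sketched here under the assumption that such padding is never meaningful, is to trim each alternative before choosing one. The names `pick_trimmed` and `rand_sentence_trimmed` are hypothetical, and the negative split limit is an addition so that a trailing empty alternative stays selectable.

```perl
use v5.14;    # needed for the non-destructive /r substitution modifier
use warnings;

# Hypothetical helper: pick one ';'-separated alternative at random, with
# surrounding whitespace removed. The -1 limit keeps trailing empty fields.
sub pick_trimmed {
    my ($group) = @_;
    my @alts = map { s/^\s+|\s+$//gr } split /;/, $group, -1;
    return @alts ? $alts[rand @alts] : '';
}

# Same innermost-first loop, delegating the choice to pick_trimmed().
sub rand_sentence_trimmed {
    my ($text) = @_;
    1 while $text =~ s/\{([^{}]*)\}/pick_trimmed($1)/ge;
    return $text;
}

print rand_sentence_trimmed('I have { {red;green} fruit ; cup of {tea;coffee} } and {nice;fresh} pie.'), "\n";
```

Whitespace outside the braces is left untouched, so an empty alternative can still leave a space in front of following punctuation.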

glenn jackman answered Sep 21 '22 00:09