
How can I delete characters between < and > in Perl?

I need to write a Perl script to read in a file, and delete anything inside < >, even if they're on different lines. That is, if the input is:

Hello, world. I <enjoy eating
bagels. They are quite tasty.
I prefer when I ate a bagel to
when I >ate a sandwich. <I also
like >bananas.

I want the output to be:

Hello, world. I ate a sandwich. bananas.

I know how to do this with a regex when the text is on one line, but not when it spans multiple lines. Ultimately I need to conditionally delete parts of a template so that I can generate parametrized config files. I thought Perl would be a good language for this, but I am still getting the hang of it.

Edit: I also need to handle more than one < > pair in the same input.

rlbond asked Apr 10 '09 14:04





2 Answers

You may want to check out the Perl module Text::Balanced, which is part of the core distribution; I think it will help you. In general, avoid using a regex for this sort of thing if the subject text is likely to contain an inner (nested) set of delimiters, because it can get very messy.
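As a rough illustration of the nested case, here is a minimal sketch that uses `extract_bracketed` from Text::Balanced to drop balanced < > groups, so an inner pair is removed together with the group that encloses it. The helper name `strip_nested` and the sample string are made up for this example, and keeping an unbalanced `<` literally is an assumption about the desired behaviour, not something the question specifies.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper (not from the answer): remove balanced <...> groups,
# treating nested pairs as part of the enclosing group.
sub strip_nested {
    my ($text) = @_;
    my $out = '';
    while (length $text) {
        # Copy everything up to the next '<' unchanged.
        if ($text =~ s/\A([^<]+)//) {
            $out .= $1;
            next;
        }
        # $text now starts with '<'; try to take one balanced group.
        my ($group, $rest) = extract_bracketed($text, '<>');
        if (defined $group && length $group) {
            $text = $rest;                    # drop the whole group
        } else {
            $out .= substr($text, 0, 1, '');  # assumption: keep an unbalanced '<'
        }
    }
    return $out;
}

print strip_nested("keep <drop <inner> drop> keep\n");
```

With a plain `s/<[^>]*>//g`, the same input would stop at the first `>` and leave ` drop>` behind, which is the kind of mess the nested case causes.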

Danny answered Nov 08 '22 13:11



In Perl:

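#!/usr/bin/perl
use strict;
use warnings;

# Slurp the whole input at once so that a < ... > pair can span lines.
# (A plain "my $text = <>;" would read only the first line.)
my $text = do { local $/; <> };
# Delete each < ... > span; [^>] also matches newlines.
$text =~ s/<[^>]*>//g;
print $text;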

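The regex matches everything from a < through the first > that follows it (inclusive) and replaces it with nothing. Because [^>] also matches newlines, a match can span several lines, provided the whole input is held in a single string. The g modifier makes the substitution global, so every < > pair is removed, not just the first.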
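To check the substitution against the sample from the question without reading a file, the following sketch wraps it in a small helper. The name `strip_angle` is hypothetical, and the here-document simply repeats the question's input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove every <...> span, including spans that
# cross line boundaries.
sub strip_angle {
    my ($text) = @_;
    $text =~ s/<[^>]*>//g;
    return $text;
}

# Sample input copied from the question.
my $input = <<'END';
Hello, world. I <enjoy eating
bagels. They are quite tasty.
I prefer when I ate a bagel to
when I >ate a sandwich. <I also
like >bananas.
END

print strip_angle($input);  # prints: Hello, world. I ate a sandwich. bananas.
```

For a quick one-off, the same substitution can probably be run from the command line as `perl -0777 -pe 's/<[^>]*>//g' input.txt`, where `-0777` puts Perl in slurp mode and `input.txt` is a placeholder file name.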

EDIT: incorporated comments from Hynek and chaos

Gene Gotimer answered Nov 08 '22 13:11
