Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

grep vs Perl. Should I mod this to work, or should I just replace it with Perl code?

I am writing something that will allow users to search through a log of theirs. Currently I have this, where $in{'SEARCH'} is the string they are searching.

open(COMMAND, "grep \"$in{'SEARCH'}\" /home/$palace/palace/logs/$logfile | tail -n $NumLines |");
    $f = <COMMAND>;
    if  ($f) {
        print $Title;
        print "<div id=log>\n";
            do {  print $f."<br>";} while ($f = <COMMAND>);
        print "</div>\n";
    } else {print $Error; }
close(COMMAND);

However I noticed that they could easily fool the script and the grep command to error by using a double quote (") or a backslash. Therefore I added this piece of code:

$in{'SEARCH'} =~ s|\\|\\\\|g;
$in{'SEARCH'} =~ s|"|\Q\"\E|g;

open(COMMAND, "grep \"$in{'SEARCH'}\" /home/$palace/palace/logs/$logfile | tail -n $NumLines |");
    $f = <COMMAND>;
    if  ($f) {
        print $Title;
        print "<div id=log>\n";
            do {  print $f."<br>";} while ($f = <COMMAND>);
        print "</div>\n";
    } else {print $Error; }
close(COMMAND);

However, I am still having problems. The grep command does not like the \ in its search giving errors like

grep "\\" /home/test/palace/logs/chat.log
grep: Trailing backslash

Should I even continue to try and use the grep command, and if so, what is a good Perl function that helps on stripping out weird characters that will help the grep command, say as making " to be \" and etc. Or, instead of messing around should I just use Perl code to accomplish the same thing, even though I've read it wouldn't be as fast as grep?


Update: 7-5-2009 5:20 PM EST


Many people contributed code, especially those attempting to be faster than the system grep. As of right now, its still the fastest. Here is how the benchmarks go:

Using system grep:

Top   of  file: 1 wallclock secs ( 0.00 usr 0.01 sys + 0.13 cusr 0.15 csys = 0.29 CPU)
Bottom of file: 1 wallclock secs ( 0.00 usr 0.00 sys + 0.21 cusr 0.18 csys = 0.39 CPU)

Using Hypneks example (push and shift):

Top   of  file: 4 wallclock secs ( 3.78 usr + 0.19 sys = 3.97 CPU)
Bottom of file: 4 wallclock secs ( 3.86 usr + 0.19 sys = 4.05 CPU)

Using my perl example (using the reverse command):

Top   of  file: 6 wallclock secs ( 4.76 usr + 1.45 sys = 6.21 CPU)
Bottom of file: 5 wallclock secs ( 3.93 usr + 1.44 sys = 5.37 CPU)

Using my File::ReadBackwards:

Top   of  file:11 wallclock secs (11.20 usr + 0.11 sys = 11.31 CPU)
Bottom of file: 1 wallclock secs ( 0.59 usr + 0.01 sys = 0.60 CPU)

Using xcramps example (built in grep):

Top   of  file: 9 wallclock secs ( 8.04 usr + 0.47 sys = 8.51 CPU)
Bottom of file: 8 wallclock secs ( 8.16 usr + 0.43 sys = 8.59 CPU)
like image 469
ParoX Avatar asked Dec 05 '22 05:12

ParoX


2 Answers

I'll agree with Ry4an that going with Perl is probably sensible.

If you want to fix up your code, you need to (should) use single quotes around the arguments that you don't want the shell messing with - not double quotes. As a for instance, either backticks or '$(...)' notation will execute commands inside double quotes, and '${variable}' and such like will be expanded. Dealing with all the notations is too much like hard work; using single quotes is much simpler. Note that embedded single quotes need to be replaced by the sequence "'\''" (that was fun!); the double quotes simply surround the four characters that you replace a single quote with (assuming the string as a whole is embedded in single quotes). The first quote closes the current quoted string; the backslash, single quote means a literal single quote; the last single quote starts a new single quoted string. You don't need to escape any other characters.


Taking your code:

my $search = $in{SEARCH};
$search =~ s/'/'\\''/g;

open(COMMAND, "grep '$search' /home/$palace/palace/logs/$logfile | tail -n $NumLines |");
$f = <COMMAND>;
if  ($f)
{
    print $Title;
    print "<div id=log>\n";
    do { print $f."<br>";} while ($f = <COMMAND>);
    print "</div>\n";
}
else
{
    print $Error;
}
close(COMMAND);
like image 146
Jonathan Leffler Avatar answered Dec 28 '22 05:12

Jonathan Leffler


Definitely just use Perl. Keeping all the work in a single process will speed things up, and invoking a subcommand with user specified args is just begging for exploits. If you do decide to stick with the subcommand quotemeta might be worth looking at.

like image 23
Ry4an Brase Avatar answered Dec 28 '22 07:12

Ry4an Brase