I am using pdfgrep to search for all appearances of a keyword in a PDF document.
Now, I want to do this via PHP so I can use it on my website.
However, when I run:
$output = shell_exec("pdfgrep -i $keyword $file");
var_dump($output);
Where $keyword is the keyword and $file is the file, I don't get the entire output.
The PDF is made up of a table of product codes, product names, and product prices.
When I execute the command via Terminal, I'm able to see the entire row of data:
product code 1 product name with keyword substring corresponding price
product code 2 product name with keyword substring corresponding price
product code 3 product name with keyword substring corresponding price
However, when I ran it via PHP, I got something like:
name with keyword substring with keyword substring product code 1
product name with keyword substring product name with keyword substring
corresponding price
It just does not get all the data. It doesn't always get the product code and the price, and there have been many instances where it doesn't get the entire product name either.
I view the output in a browser and set header('Content-Type: text/plain'); but that only prettifies the output; the data is still incomplete.
I've tried to run the exact same shell script via Python3.6 and that gave me the output I desired.
Now, I've tried to run the same Python script via PHP but I still get the same broken output.
I've tried to run a keyword that I know would return a shorter output, but I still don't get the entire data line that I need.
Is there any way to reliably get all the data returned by the shell_exec() command?
Are there alternatives available, such as a different command, or running a Python script from the server (since the Python script doesn't have any issues anyway)?
The shell_exec() function is a built-in PHP function that executes a command via the shell and returns the complete output as a string. It is equivalent to the backtick operator, which will be familiar to those used to *nix shells.
You can execute Linux commands within a PHP script: all you have to do is put the command line in backticks (`). Also take a look at exec() alongside shell_exec().
Updated: 05/04/2019 by Computer Hope. On Unix-like operating systems, exec is a builtin command of the Bash shell. It lets you execute a command that completely replaces the current process: the current shell process is destroyed and entirely replaced by the command you specify. Note that this is not the same thing as PHP's exec() function, which runs the command as a child process and returns control to the script.
I don't know how pdfgrep works but maybe it mixes stdout and stderr? Either way, you could use a construction like this, where you capture the output stream into an output buffer, optionally also mixing stderr into stdout:
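$mixStdErrIntoStdOut = false;
$exitCode = 0;
// system() echoes the command's output, so capture it in an output buffer.
ob_start();
// Pass $exitCode as a plain variable: system() already takes it by reference,
// and call-time pass-by-reference (&$exitCode) is a fatal error since PHP 5.4.
if ($mixStdErrIntoStdOut) {
    // 2>&1 redirects stderr into stdout
    system("pdfgrep -i $keyword $file 2>&1", $exitCode);
} else {
    system("pdfgrep -i $keyword $file", $exitCode);
}
$output = ob_get_clean();
var_dump($output);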
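One thing worth ruling out, since the command above interpolates $keyword and $file unquoted: if either contains spaces or shell metacharacters, the shell splits it into several arguments and pdfgrep ends up searching for something other than what you intended. Here is a minimal sketch of quoting both with escapeshellarg(). The helper name buildPdfgrepCommand and the example values are made up for illustration, and I am only assuming this is related to your problem.

```php
<?php
// Hypothetical helper (not from the question): builds the pdfgrep command
// line with each argument quoted for the shell.
function buildPdfgrepCommand(string $keyword, string $file): string
{
    // escapeshellarg() quotes the value and escapes embedded quotes, so the
    // shell hands it to pdfgrep as exactly one argument.
    return 'pdfgrep -i ' . escapeshellarg($keyword) . ' ' . escapeshellarg($file);
}

// Example values, invented for illustration only.
$command = buildPdfgrepCommand('blue widget', '/tmp/price list.pdf');
echo $command, PHP_EOL;
```

You can then pass $command to shell_exec(), system(), or any of the other functions in this thread instead of the hand-built string.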
There are a number of ways to execute the process and gather its output. If you can consistently reproduce the problem, you may want to try the other process execution methods:
1) exec($command, &$output)
$output = [];
exec($command, $output);
This should push all of the output, line by line, into your $output array. exec() appends to the array if it already contains elements, so initialize it to an empty array before the call.
2) passthru($command)
This passes all of the command's output straight through to PHP's output, so to capture it you need to wrap the call in an output buffer:
ob_start();
passthru($command);
$contents = ob_get_contents();
ob_end_clean();
3) popen($command, "r");
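$output = "";
$handle = popen($command, "r");
// Read the command's stdout in 4 KB chunks until the pipe is exhausted.
while (!feof($handle)) {
    $output .= fread($handle, 4096);
}
// Close the pipe and wait for the process to finish.
pclose($handle);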
Let me know what you get from each of these methods.
Also, make sure to check stderr for errors.
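If it helps with checking stderr, here is a rough sketch using proc_open() that keeps stdout, stderr, and the exit code separate. The helper name runCommand is made up for this example. Sending stderr to a temporary file is just one way to do it; I chose it so that the script cannot block on one pipe while the child is waiting on the other.

```php
<?php
// Hypothetical helper: runs a shell command and returns its stdout, stderr
// and exit code separately.
function runCommand(string $command): array
{
    // stderr goes to a temporary file so only one pipe has to be read.
    $errFile = tempnam(sys_get_temp_dir(), 'err');
    $descriptors = [
        1 => ['pipe', 'w'],           // child's stdout, read through $pipes[1]
        2 => ['file', $errFile, 'w'], // child's stderr, written to the temp file
    ];

    $process = proc_open($command, $descriptors, $pipes);
    if ($process === false) {
        unlink($errFile);
        throw new RuntimeException("Could not start command: $command");
    }

    // Read everything the command writes to stdout, then wait for it to exit.
    $stdout = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    $exitCode = proc_close($process);

    $stderr = file_get_contents($errFile);
    unlink($errFile);

    return ['stdout' => $stdout, 'stderr' => $stderr, 'exitCode' => $exitCode];
}
```

Calling runCommand() with your pdfgrep command line and dumping the result should show whether anything is being written to stderr, and whether the exit code is non-zero when you run it under the web server.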