Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Profiling more than function calls in PHP

I've started on my first somewhat serious programming project with PHP and am having a hard time getting good profiling information. I have used xdebug and cachegrind to profile the code, but it assumes that a function's "self" execution time is the total time the function takes to execute minus the aggregated time of any function(s) it calls.

While this is correct in essence, it does not help as much as it could when I'm trying to optimize the code. For example, I have a simple function with two if statements which returns an array with two values and it takes 26% of the total execution time according to xdebug. It uses array_key_exists, and I can see that the execution time of those function calls is around 1%. What I cannot see is how other 25% of the execution time is accounted for with the rest of the code. How long did it take for the if statements, getting the pair of values, putting those values into an array, and returning the array?

Is there some xdebug flag which I missed in the documentation? Or is there a profiling tool for PHP which will profile variable assignments, operators, the construction of the basic data types, and other non-functional calls? Preferably, something that does not assume that the script is available via a web server.

An example of something along the lines of what I'm looking for is nytprof for perl. Note that it profiles every single line of code within the function (besides the last push, which never gets executed).

like image 939
Ogo Avatar asked Mar 26 '11 16:03

Ogo


2 Answers

If you want to measure per line statistics I'd suggest using a combination of declare (for tick), register_tick_function, microtime() and debug_backtrace. This lets you call arbitrary functions after every line of execution w/o altering the PHP source.

For example:

declare(ticks=1);
register_tick_function( function() {
    print "call\n";
} );

print "hi\n";
print "bye\n";

Actually runs like this:

declare(ticks=1);
register_tick_function( function() {
    //print "call\n";
} ); print "call\n";

print "hi\n"; print "call\n";
print "bye\n"; print "call\n";

Unfortunately your on your own if you go this route. You'll have to log the data yourself, analyze and then figure out an way of correlating it with the source. It's not impossible but as far as I know these tools don't already exist for PHP.

Proof-of-concept:

function _line_benchmark( $reset = FALSE ) {
    $start_time = microtime(TRUE);
    static $lastFrame = NULL;

    if( $reset ) {
        $lastFrame = NULL;
        return;
    }

    $backtrace = debug_backtrace();
    $last_backtrace = $backtrace[count($backtrace) -1];
    $file = $last_backtrace["file"];
    $line = $last_backtrace["line"];
    $source = file_get_contents( $file );
    $source_lines = preg_split( "/((\r(?!\n))|((?<!\r)\n)|(\r\n))/", $source );

    if( ! is_null( $lastFrame ) ) {
        if( $lastFrame["line"]+1 <= $line ) {
            print "{\n";
            for( $i = $lastFrame["line"]+1; $i <= $line; $i++ ) {
                print "#".($i-1)."\t".$source_lines[$i-1]."\n";
            }
            print "} - ".($start_time-$lastFrame["time"])."\n\n";
        }
    }

    $lastFrame = array(
        "time" => microtime(TRUE),
        "file" => $file,
        "line" => $line
    );
}

function line_benchmark_start() {
    _line_benchmark( TRUE ); //reset benchmark functions

    declare(ticks=1);
    register_tick_function( "_line_benchmark" );
}

function line_benchmark_stop() {
    unregister_tick_function( "_line_benchmark" );
}


line_benchmark_start();

usleep( 100 );
usleep( 1000 );
usleep( 10000 );
usleep( 100000 );
usleep( 1000000 );

line_benchmark_stop();

Output

{
#48     
#49     usleep( 100 );
} - 0.000154972076416

{
#50     usleep( 1000 );
} - 0.00112199783325

{
#51     usleep( 10000 );
} - 0.0101318359375

{
#52     usleep( 100000 );
} - 0.0998418331146

{
#53     usleep( 1000000 );
} - 0.999831914902
like image 128
Kendall Hopkins Avatar answered Oct 05 '22 18:10

Kendall Hopkins


When you're running the program under xdebug, you should be able to break into it's execution manually at a random time, and then display the call stack. If you do that several times, any code that's responsible for significant time will appear on those stack samples. That's the random-pause technique.

It is a bit of a brain-shift. Things like self-time, measuring function execution time, or counting calls, for the purpose of finding problems, are indirect information at best. Rather, it says anything that's actually costing much time can't escape your attention if you just drop in on it and take a look. Anything you see it "doing" (and that means any line of code on the stack that you catch more than once) is something that will give a good speedup if you can avoid doing it. Precision of measurement is irrelevant. You've found it, so you can fix it, and afterwards measure the speedup.

like image 29
Mike Dunlavey Avatar answered Oct 05 '22 18:10

Mike Dunlavey