Text version-control in PHP with difference highlight

If you have ever edited a question here on StackOverflow, you have probably noticed that it keeps track of exactly what changes were applied to a question. Each edit is displayed with the removed portions of text highlighted in red and the added portions in green. My question is how to implement such a system myself. I am trying to build a custom CMS in PHP with MySQL, and this seems like a very cool feature to tackle.

Any advice? Or are there open source libraries that already do this, so that I can analyze how they work?

Demonstration

Here I added some text which will show green if you click on the edit link to see the changes.

miki725 asked Mar 18 '11 21:03

2 Answers

/*
    Paul's Simple Diff Algorithm v 0.1
    (C) Paul Butler 2007 <http://www.paulbutler.org/>
    May be used and distributed under the zlib/libpng license.

    This code is intended for learning purposes; it was written with short
    code taking priority over performance. It could be used in a practical
    application, but there are a few ways it could be optimized.

    Given two arrays, the function diff will return an array of the changes.
    I won't describe the format of the array, but it will be obvious
    if you use print_r() on the result of a diff on some test data.

    htmlDiff is a wrapper for the diff command, it takes two strings and
    returns the differences in HTML. The tags used are <ins> and <del>,
    which can easily be styled with CSS.
*/

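function diff($old, $new){
    // $matrix[o][n] = length of the common run of words ending at $old[o] and $new[n]
    $matrix = array();
    $maxlen = 0; // length of the longest common run found so far
    foreach($old as $oindex => $ovalue){
        // every position in $new holding the same word (strict comparison,
        // so numeric-looking words such as '1e3' and '1000' are not treated as equal)
        $nkeys = array_keys($new, $ovalue, true);
        foreach($nkeys as $nindex){
            $matrix[$oindex][$nindex] = isset($matrix[$oindex - 1][$nindex - 1]) ?
                $matrix[$oindex - 1][$nindex - 1] + 1 : 1;
            if($matrix[$oindex][$nindex] > $maxlen){
                $maxlen = $matrix[$oindex][$nindex];
                $omax = $oindex + 1 - $maxlen; // start of the run in $old
                $nmax = $nindex + 1 - $maxlen; // start of the run in $new
            }
        }
    }
    // nothing in common: all of $old was deleted and all of $new was inserted
    if($maxlen == 0) return array(array('d'=>$old, 'i'=>$new));
    // keep the longest common run and recurse on the parts before and after it
    return array_merge(
        diff(array_slice($old, 0, $omax), array_slice($new, 0, $nmax)),
        array_slice($new, $nmax, $maxlen),
        diff(array_slice($old, $omax + $maxlen), array_slice($new, $nmax + $maxlen)));
}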

function htmlDiff($old, $new){
    $ret = '';
    $diff = diff(explode(' ', $old), explode(' ', $new));
    foreach($diff as $k){
        if(is_array($k))
            $ret .= (!empty($k['d'])?'<del>'.implode(' ',$k['d']).'</del> ':'').
                (!empty($k['i'])?'<ins>'.implode(' ',$k['i']).'</ins> ':'');
        else $ret .= $k . ' ';
    }
    return $ret;
}

I'm pretty sure I changed something in it. Other than that, it should work perfectly.

Example of use:

$a='abc defg h 12345';
$b='acb defg ikl 66 123 456';
echo htmlDiff($a,$b);

And the result:

<del>abc</del> <ins>acb</ins> defg <del>h 12345</del> <ins>ikl 66 123 456</ins> 

And visibly (by default, browsers show the <del> parts struck through and the <ins> parts underlined):

abc acb defg h 12345 ikl 66 123 456
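
Since the header comment leaves the return format of diff() undescribed, here is a small sketch of it. The $example array is what diff() returns for the two strings above, written out by hand: unchanged words appear as plain strings, and each change is an array with 'd' (deleted words) and 'i' (inserted words). The renderDiffEscaped() helper is a hypothetical addition, not part of Paul Butler's code; it assumes the text comes from users of a CMS and therefore passes every word through htmlspecialchars() before wrapping it in tags.

```php
<?php
// Hypothetical helper (not part of the original code): renders the array
// returned by diff() as HTML, escaping every word with htmlspecialchars().
function renderDiffEscaped(array $diff): string
{
    // Escape each word, then join the words with single spaces.
    $join = function (array $words): string {
        return implode(' ', array_map('htmlspecialchars', $words));
    };

    $parts = array();
    foreach ($diff as $chunk) {
        if (is_array($chunk)) {
            // A change: 'd' holds the removed words, 'i' the inserted words.
            if (!empty($chunk['d'])) {
                $parts[] = '<del>' . $join($chunk['d']) . '</del>';
            }
            if (!empty($chunk['i'])) {
                $parts[] = '<ins>' . $join($chunk['i']) . '</ins>';
            }
        } else {
            // A plain string: a word present in both versions.
            $parts[] = htmlspecialchars($chunk);
        }
    }
    return implode(' ', $parts);
}

// What diff() returns for the example above, written out by hand.
$example = array(
    array('d' => array('abc'), 'i' => array('acb')),
    'defg',
    array('d' => array('h', '12345'), 'i' => array('ikl', '66', '123', '456')),
);

echo renderDiffEscaped($example), "\n";
```

This prints the same markup as htmlDiff() for this input, minus the trailing space. With escaping in place, a word such as <b> in the edited text shows up literally instead of being interpreted as a tag.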

Christian answered Oct 15 '22 05:10



The PEAR Text_Diff component might be helpful here: it lets you compute and render diffs between two pieces of text.


If you take a look at the Renderer examples page, the Inline one should do what you want. In the given example, it:

  • surrounds deleted text with <del> and </del>
  • surrounds added text with <ins> and </ins>

If you use a bit of CSS to style those, you should be able to get what you're asking for.
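
As a rough sketch of that styling step: the helper below wraps already-rendered diff HTML in a container and prepends a small stylesheet, so removed text shows red and added text shows green as described in the question. The function name styleDiffHtml(), the "diff" class name and the exact colours are illustrative assumptions, not something Text_Diff provides.

```php
<?php
// Hypothetical helper: prepends a small stylesheet to already-rendered diff
// HTML so that <del> shows red and <ins> shows green.
function styleDiffHtml(string $diffHtml): string
{
    // The "diff" class name and the colours are illustrative choices.
    $css = '<style>'
         . '.diff del { background: #fdd; color: #900; text-decoration: line-through; }'
         . '.diff ins { background: #dfd; color: #060; text-decoration: none; }'
         . '</style>';

    // Scope the rules to one container so other <ins>/<del> on the page are unaffected.
    return $css . "\n" . '<div class="diff">' . $diffHtml . '</div>';
}

echo styleDiffHtml('<del>abc</del> <ins>acb</ins> defg'), "\n";
```

In a real page the rules would more likely live in the site's stylesheet; inlining them here just keeps the example in one piece.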

Pascal MARTIN answered Oct 15 '22 05:10
