
Get string between - Find all occurrences PHP

Tags:

php

I found this function, which returns the data between two strings of text, HTML or whatever.

How can it be changed so that it finds all occurrences, i.e. the data between every occurrence of $start [some-random-data] $end? I want every [some-random-data] in the document (the data will always be different).

function getStringBetween($string, $start, $end) {
    $string = " ".$string;
    $ini = strpos($string,$start);
    if ($ini == 0) return "";
    $ini += strlen($start);
    $len = strpos($string,$end,$ini) - $ini;
    return substr($string,$ini,$len);
}
user3778578 asked Nov 22 '14 14:11


5 Answers

One possible approach:

function getContents($str, $startDelimiter, $endDelimiter) {
  $contents = array();
  $startDelimiterLength = strlen($startDelimiter);
  $endDelimiterLength = strlen($endDelimiter);
  $startFrom = $contentStart = $contentEnd = 0;
  while (false !== ($contentStart = strpos($str, $startDelimiter, $startFrom))) {
    $contentStart += $startDelimiterLength;
    $contentEnd = strpos($str, $endDelimiter, $contentStart);
    if (false === $contentEnd) {
      break;
    }
    $contents[] = substr($str, $contentStart, $contentEnd - $contentStart);
    $startFrom = $contentEnd + $endDelimiterLength;
  }

  return $contents;
}

Usage:

$sample = '<start>One<end>aaa<start>TwoTwo<end>Three<start>Four<end><start>Five<end>';
print_r( getContents($sample, '<start>', '<end>') );
/*
Array
(
    [0] => One
    [1] => TwoTwo
    [2] => Four
    [3] => Five
)
*/ 

raina77ow answered Oct 05 '22 02:10


You can do this using regex:

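function getStringsBetween($string, $start, $end)
{
    // Pass '/' to preg_quote() so delimiters such as '</p>' do not break the pattern.
    // The "s" modifier lets "." match newlines, so multi-line content is captured too.
    $pattern = sprintf(
        '/%s(.*?)%s/s',
        preg_quote($start, '/'),
        preg_quote($end, '/')
    );
    preg_match_all($pattern, $string, $matches);

    return $matches[1];
}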
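
When a delimiter contains the pattern delimiter `/` (for example `</p>`), preg_quote() needs `/` as its second argument to escape it, and `.` only matches newlines when the `s` modifier is set. A minimal sketch of both points, with sample data made up for illustration:

```php
<?php
$start = '<p>';
$end = '</p>';

// Escape both delimiters, including the "/" used as the pattern delimiter.
$pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';

// The second paragraph spans two lines; the "s" modifier lets "." cross the newline.
$html = "<p>a</p>\n<p>b\nc</p>";
preg_match_all($pattern, $html, $matches);

print_r($matches[1]); // two entries: "a" and "b\nc"
```

The lazy quantifier `.*?` stops at the first end delimiter after each start, so neighbouring blocks are not merged into one match.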
ifm answered Oct 05 '22 02:10


I like to use explode to get the string between two strings. This function also works for multiple occurrences.

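function GetIn($str,$start,$end){
    $p = array(); // returned as-is when $start never occurs
    $p1 = explode($start,$str);
    for($i=1;$i<count($p1);$i++){
        $p2 = explode($end,$p1[$i]);
        // keep the chunk only if $end actually follows this $start
        if(count($p2) > 1){
            $p[] = $p2[0];
        }
    }
    return $p;
}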
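
To see why splitting works, it can help to look at the intermediate arrays. In this sketch (the sample string is made up for illustration), every chunk after the first begins right after a start delimiter, and the text before the first end delimiter in that chunk is the wanted content:

```php
<?php
$sample = 'x<start>One<end>y<start>Two<end>z<start>Three';

// Splitting on the start delimiter: chunk 0 is the text before the first
// <start>; each later chunk begins right after a <start>.
$chunks = explode('<start>', $sample);
// $chunks: ['x', 'One<end>y', 'Two<end>z', 'Three']

// Splitting a chunk on the end delimiter puts the wanted content at index 0.
$closed = explode('<end>', $chunks[1]);
// $closed: ['One', 'y']

// A chunk with no <end> comes back as a single element, so count() can
// tell an unclosed <start> apart from a closed one.
$unclosed = explode('<end>', $chunks[3]);
// $unclosed: ['Three']
```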
Shamim answered Oct 05 '22 03:10


I needed to find all occurrences between a specific start and end tag, change them somehow, and get the changed string back.

So I added this small snippet after the function from raina77ow's approach.

        $sample = '<start>One<end> aaa <start>TwoTwo<end> Three <start>Four<end> aaaaa <start>Five<end>';
        $sample_temp = getContents($sample, '<start>', '<end>');
        $i = 1;
        foreach($sample_temp as $value) {
            $value2 = $value.'-'.$i; // change the extracted value here
            $sample=str_replace('<start>'.$value.'<end>',$value2,$sample);
            $i++;
        }
        echo $sample;

The output string now has the tags removed, and each string between them has a number appended, like this:

One-1 aaa TwoTwo-2 Three Four-3 aaaaa Five-4

But you can do whatever else you like with them. Maybe this will be helpful for someone.
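
One caveat with str_replace is that it replaces every identical tagged value at once, so repeated values all receive the counter of their first occurrence. A possible alternative, sketched here with preg_replace_callback, transforms each match in a single pass. The helper name replaceBetween and its callback signature are hypothetical and not part of the answers above:

```php
<?php
// Hypothetical helper: replaces each "$start...$end" block with the value
// returned by $transform($content, $index), where $index starts at 1.
function replaceBetween($str, $start, $end, callable $transform) {
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';
    $i = 0;
    return preg_replace_callback($pattern, function ($m) use ($transform, &$i) {
        $i++;
        // $m[1] is the text between the delimiters for this match only.
        return $transform($m[1], $i);
    }, $str);
}

// "One" appears twice on purpose to show that each match gets its own number.
$sample = '<start>One<end> aaa <start>One<end> Three <start>Four<end>';
$result = replaceBetween($sample, '<start>', '<end>', function ($value, $i) {
    return $value . '-' . $i;
});

echo $result; // One-1 aaa One-2 Three Four-3
```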

Grows answered Oct 05 '22 02:10


There are some great solutions here, but none was a perfect fit for extracting parts of code from, say, HTML, which was my problem: I need to get the script blocks out of the HTML before compressing it. So, building on @raina77ow's original solution, expanded by @Cas Tuyn, I arrived at this one:

$test_strings = [
    '0<p>a</p>1<p>b</p>2<p>c</p>3',
    '0<p>a</p>1<p>b</p>2<p>c</p>',
    '<p>a</p>1<p>b</p>2<p>c</p>3',
    '<p>a</p>1<p>b</p>2<p>c</p>',
    '<p></p>1<p>b'
];

/**
* Separate a block of code into sub-blocks. Example: removing all <script>...</script> tags from HTML code
* 
* @param string $str, text block
* @param string $startDelimiter, string to match for start of block to be extracted
* @param string $endDelimiter, string to match for ending the block to be extracted
* @return array with keys 'in' (all full blocks, delimiters included) and 'out' (what is left of the string)
*/
function getDelimitedStrings($str, $startDelimiter, $endDelimiter) {
    // Both keys always exist, so callers can loop over them without checks.
    $contents = array('in' => array(), 'out' => array());
    $startDelimiterLength = strlen($startDelimiter);
    $endDelimiterLength = strlen($endDelimiter);
    $position = 0; // start of the text that has not been consumed yet
    while (false !== ($blockStart = strpos($str, $startDelimiter, $position))) {
        $blockEnd = strpos($str, $endDelimiter, $blockStart + $startDelimiterLength);
        if (false === $blockEnd) {
            break; // unclosed block: it stays in the trailing 'out' text
        }
        $blockEnd += $endDelimiterLength; // include the end delimiter in the block
        if ($blockStart > $position) {
            // text between the previous block (or the string start) and this block
            $contents['out'][] = substr($str, $position, $blockStart - $position);
        }
        $contents['in'][] = substr($str, $blockStart, $blockEnd - $blockStart);
        $position = $blockEnd;
    }
    if ($position < strlen($str)) {
        // whatever follows the last complete block
        $contents['out'][] = substr($str, $position);
    }

    return $contents;
}

foreach($test_strings AS $string){
    var_dump( getDelimitedStrings($string, '<p>', '</p>') );
}

This will extract all <p> elements, including any inner HTML, giving this result:

array (size=2)
'in' => array (size=3)
    0 => string '<p>a</p>' (length=8)
    1 => string '<p>b</p>' (length=8)
    2 => string '<p>c</p>' (length=8)
'out' => array (size=4)
    0 => string '0' (length=1)
    1 => string '1' (length=1)
    2 => string '2' (length=1)
    3 => string '3' (length=1)

array (size=2)
'in' => array (size=3)
    0 => string '<p>a</p>' (length=8)
    1 => string '<p>b</p>' (length=8)
    2 => string '<p>c</p>' (length=8)
'out' => array (size=3)
    0 => string '0' (length=1)
    1 => string '1' (length=1)
    2 => string '2' (length=1)

array (size=2)
'in' => array (size=3)
    0 => string '<p>a</p>' (length=8)
    1 => string '<p>b</p>' (length=8)
    2 => string '<p>c</p>' (length=8)
'out' => array (size=3)
    0 => string '1' (length=1)
    1 => string '2' (length=1)
    2 => string '3' (length=1)

array (size=2)
'in' => array (size=3)
    0 => string '<p>a</p>' (length=8)
    1 => string '<p>b</p>' (length=8)
    2 => string '<p>c</p>' (length=8)
'out' => array (size=2)
    0 => string '1' (length=1)
    1 => string '2' (length=1)

array (size=2)
'in' => array (size=1)
    0 => string '<p></p>' (length=7)
'out' => array (size=1)
    0 => string '1<p>b' (length=5)

You can see a demo here: 3v4l.org/TQLmn
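
For the script-block use case, a literal start delimiter such as `<script>` misses tags that carry attributes. One possible way around that, sketched here with preg_split, is a case-insensitive pattern for the opening tag. The helper name splitScriptBlocks is hypothetical, and the regex is a simplification that assumes no `</script>` text inside the scripts themselves:

```php
<?php
// Hypothetical helper: separates <script>...</script> blocks ('in')
// from the rest of the markup ('out').
function splitScriptBlocks($html) {
    // With one capture group and PREG_SPLIT_DELIM_CAPTURE the result
    // alternates: outside text, script block, outside text, ...
    $parts = preg_split('/(<script\b.*?<\/script>)/is', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    $result = array('in' => array(), 'out' => array());
    foreach ($parts as $index => $part) {
        if ($index % 2 === 1) {
            $result['in'][] = $part;   // odd indexes hold the captured blocks
        } elseif ($part !== '') {
            $result['out'][] = $part;  // skip empty gaps between blocks
        }
    }
    return $result;
}

$html = '<div>a</div><script type="text/javascript">var x = 1;</script>'
      . '<p>b</p><script>var y = 2;</script>';
$blocks = splitScriptBlocks($html);
print_r($blocks);
```

The 'out' parts can then be compressed and the 'in' parts handled separately; putting them back together in the original order would need the positions as well, which this sketch does not track.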

Kim Steinhaug answered Oct 05 '22 02:10