
Multiple wildcard preg_match_all php

I want to extract a number from HTML, from between <td>...</td> tags. I have tried the following code:

$views = "/<td id=\"adv-result-views-(?:.*)\" class=\"spec\">(.*?)<\/td>/";

The part after -views- is a random number. What is the right pattern for ignoring that random number in the search?

user3625376 asked Nov 01 '22 00:11


2 Answers

Using a DOM parser is the right way to do this.

Proceed like this:

<?php
$htm = '<td id="adv-result-views-190147977" class="spec"> 4 </td>';
$dom = new DOMDocument;
$dom->loadHTML($htm);
// nodeValue includes the surrounding whitespace, so trim it before printing
echo $content = trim($dom->getElementsByTagName('td')->item(0)->nodeValue); // 4
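
When the page holds several such cells, the same DOM approach can select them by the fixed part of the id. The following is a minimal sketch, assuming the cells all share the "adv-result-views-" prefix; the sample markup and the $views variable are made up for illustration.

```php
<?php
// Hypothetical sample markup: two cells whose ids differ only in the numeric suffix.
$html = '<table><tr>'
      . '<td id="adv-result-views-190147977" class="spec"> 4 </td>'
      . '<td id="adv-result-views-190147978" class="spec"> 12 </td>'
      . '</tr></table>';

$dom = new DOMDocument;
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select every <td> whose id begins with the fixed prefix, whatever number follows.
$views = [];
foreach ($xpath->query('//td[starts-with(@id, "adv-result-views-")]') as $td) {
    $views[] = trim($td->nodeValue);
}

print_r($views);
```

The XPath function starts-with() compares only the beginning of the id attribute, so the changing number never has to be matched at all.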
Shankar Narayana Damodaran answered Nov 09 '22 14:11


$html = '<td id="adv-result-views-190147977" class="spec"> 4 </td>';

// get the value of element
echo trim( strip_tags( $html ) );

// get the number in the id attribute: replace the whole string with capture group $1
echo preg_replace( '/^.*?id="[\p{Ll}-]+(\d+).*$/s', '$1', $html );
/*
    ^.*?                Any characters from the beginning of the string, non-greedy
        id="            Find 'id="'
            [\p{Ll}-]+  Lowercase letters and '-' (1 or more times)
            (\d+)       Group and capture to $1: digits 0-9 (1 or more times)
    .*$                 Any characters, greedy, until the end of the string
*/

// get the html without the number in the id attribute
echo preg_replace( '/(^.*?id="[\p{Ll}-]+)(\d+)(.*$)/s', '$1$3', $html );

This is a regex example because the question is tagged as such, but DOM is the preferred way (especially in the SO community) to parse HTML.
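
For the pattern in the question itself, one possible fix is to replace the greedy (?:.*) with \d+, so that only the digits after -views- are skipped. A greedy .* can run past the closing quote when several cells sit on the same line. This is a sketch under the assumption that the suffix is always numeric and the attributes appear in exactly this order; the sample markup is invented.

```php
<?php
// Hypothetical sample markup with two matching cells.
$html = '<td id="adv-result-views-190147977" class="spec"> 4 </td>'
      . '<td id="adv-result-views-190147978" class="spec"> 12 </td>';

// \d+ stands in for the changing number; \s* drops the padding around the value.
// Using # as the delimiter avoids escaping the slash in </td>.
$pattern = '#<td id="adv-result-views-\d+" class="spec">\s*(.*?)\s*</td>#s';

preg_match_all($pattern, $html, $matches);

// $matches[1] holds the captured cell contents, one entry per matching <td>.
print_r($matches[1]);
```

This stays brittle if the attribute order or quoting changes, which is one reason the DOM approach is usually preferred.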

Danijel answered Nov 09 '22 12:11