
Decode multiple XML tags inside a string using PHP

I'm looking for a 'smart way' of decoding multiple XML tags inside a string. I have the following function:

function b($params) {
    $xmldata = '<?xml version="1.0" encoding="UTF-8" ?><root>' . html_entity_decode($params['data']) . '</root>';
    $lang = ucfirst(strtolower($params['lang']));
    if (simplexml_load_string($xmldata) === FALSE) {
        return $params['data'];
    } else {
        $langxmlobj = new SimpleXMLElement($xmldata);

        if ($langxmlobj -> $lang) {
            return $langxmlobj -> $lang;
        } else {
            return $params['data'];
        }
    }
}

And trying it out:

$params['data'] = '<French>Service DNS</French><English>DNS Service</English> - <French>DNS Gratuit</French><English>Free DNS</English>';
$params['lang'] = 'French';
$a = b($params);
print_r($a);

But it outputs:

Service DNS

I want it to output the content of every matching tag, so the result should be:

Service DNS - DNS Gratuit

I'm pulling my hair out. Any quick help or direction would be appreciated.


Edit: Clarifying what I need.

It seems I wasn't clear enough, so let me show another example.

If I have the following string as input:

The <French>Chat</French><English>Cat</English> is very happy to stay on stackoverflow 
because it makes him <French>Heureux</French><English>Happy</English> to know that it 
is the best <French>Endroit</French><English>Place</English> to find good people with
good <French>Réponses</French><English>Answers</English>.

So if I ran the function with 'French', it would return:

The Chat is very happy to stay on stackoverflow 
because it makes him Heureux to know that it 
is the best Endroit to find good people with
good Réponses.

And with 'English':

The Cat is very happy to stay on stackoverflow 
because it makes him Happy to know that it 
is the best Place to find good people with
good Answers.

I hope it's clearer now.

Disco asked Dec 12 '13 12:12


3 Answers

Basically, I would first match each language section, like:

<French>Chat</French><English>Cat</English>

with this:

"@(<($defLangs)>.*?</\\2>)+@i"

Then extract the requested language's string from each match with a callback.
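
To see what the pattern captures before any callback runs, here is a minimal sketch using preg_match_all. The sample text and the variable names ($pattern, $text, $matches) are my own assumptions for illustration, and the language list is hard-coded instead of interpolated from $defLangs.

```php
<?php
// Same pattern as above, with the language list written out.
// \2 is a backreference to the tag name captured by (French|English).
$pattern = '@(<(French|English)>.*?</\2>)+@i';

// Hypothetical sample input with two runs of adjacent language tags.
$text = 'The <French>Chat</French><English>Cat</English> is '
      . '<French>Heureux</French><English>Happy</English>.';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds each full run of adjacent language tags.
foreach ($matches[0] as $run) {
    echo $run, "\n";
}
```

Each run contains every language variant for one spot in the text, which is why the callback only has to pick one variant out of $matches[0].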

If you have PHP 5.3+, then:

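function transLang($str, $lang, $defLangs = 'French|English')
{
    // Match each run of adjacent language tags; \\2 refers back to the tag name.
    return preg_replace_callback ( "@(<($defLangs)>.*?</\\2>)+@i", 

            function ($matches) use($lang)
            {
                // Quote the language name so it is safe to embed in the pattern.
                $tag = preg_quote ( $lang, '/' );

                preg_match ( "/<$tag>(.*?)<\/$tag>/i", $matches [0], $langSec );

                // Drop the run (without a notice) if the requested language is absent.
                return isset ( $langSec [1] ) ? $langSec [1] : '';
            }, $str );
}

// $str is the input text from the question.
echo transLang ( $str, 'French' ), "\n", transLang ( $str, 'English' );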

If not, it is a little more complicated:

class LangHelper
{

    private $lang;

    function __construct($lang)
    {
        $this->lang = $lang;
    }

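    public function callback($matches)
    {
        // Quote the language name so it is safe to embed in the pattern.
        $lang = preg_quote ( $this->lang, '/' );

        preg_match ( "/<$lang>(.*?)<\/$lang>/i", $matches [0], $subMatches );

        // Drop the run (without a notice) if the requested language is absent.
        return isset ( $subMatches [1] ) ? $subMatches [1] : '';
    }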

}

function transLang($str, $lang, $defLangs = 'French|English')
{
    $langHelper = new LangHelper ( $lang );

    return preg_replace_callback ( "@(<($defLangs)>.*?</\\2>)+@i", 
            array (
                    $langHelper,
                    'callback' 
            ), $str );
}

echo transLang ( $str, 'French' ), "\n", transLang ( $str, 'English' );

Andrew answered Nov 13 '22 04:11


If I understand you correctly, you would like to remove all "language" tags but keep the contents of the provided language.

The DOM is a tree of nodes. Tags are element nodes; the text is stored in text nodes. XPath lets you select nodes using expressions. So take all the child nodes of the language elements you want to keep and copy them to just before the language node. Then remove all language nodes. This will work even if the language elements contain other element nodes, like an <em>.
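
As a small illustration of the selection step only, here is a sketch that evaluates an XPath expression and lists the child nodes of each hit. The fragment with a nested <em> and the names $fragment, $kinds and $summary are assumptions made up for this example.

```php
<?php
// Hypothetical fragment: the French element contains an <em> plus a text node.
$fragment = 'The <French><em>Chat</em> noir</French>'
          . '<English>black <em>cat</em></English>.';

$dom = new DOMDocument();
// Wrap the fragment in a root element so it is a well-formed document.
$dom->loadXML('<content>' . $fragment . '</content>');
$xpath = new DOMXPath($dom);

$summary = array();
// '//French' selects every French element anywhere in the document.
foreach ($xpath->evaluate('//French') as $node) {
    $kinds = array();
    foreach ($node->childNodes as $child) {
        // Element nodes report their tag name, text nodes report '#text'.
        $kinds[] = $child->nodeName;
    }
    $summary[] = $node->textContent . ' [' . implode(', ', $kinds) . ']';
}

echo implode("\n", $summary), "\n";
```

The French element here has two children, an element node and a text node, and both need to be copied in front of it before it is removed.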

function replaceLanguageTags($fragment, $language) {
  $dom = new DOMDocument();
  $dom->loadXml(
    '<?xml version="1.0" encoding="UTF-8" ?><content>'.$fragment.'</content>'
  );
  // get an xpath object
  $xpath = new DOMXpath($dom);

  // fetch all nodes for the language you want to keep
  $nodes = $xpath->evaluate('//'.$language);
  foreach ($nodes as $node) {
    // copy all the child nodes to just before the found node
    foreach ($node->childNodes as $childNode) {
      $node->parentNode->insertBefore($childNode->cloneNode(TRUE), $node);
    }
    // remove the found node
    $node->parentNode->removeChild($node);
  }

  // select all language nodes
  $tags = array('English', 'French');
  $nodes = $xpath->evaluate('//'.implode('|//', $tags));
  foreach ($nodes as $node) {
    // remove them
    $node->parentNode->removeChild($node);
  }

  $result = '';
  // we do not need the root node, so save all its children
  foreach ($dom->documentElement->childNodes as $node) {
    $result .= $dom->saveXml($node);
  }
  return $result;
}

$xml = <<<'XML'
The <French>Chat</French><English>Cat</English> is very happy to stay on stackoverflow
because it makes him <French>Heureux</French><English>Happy</English> to know that it
is the best <French>Endroit</French><English>Place</English> to find good people with
good <French>Réponses</French><English>Answers</English>.
XML;

var_dump(replaceLanguageTags($xml, 'English'));
var_dump(replaceLanguageTags($xml, 'French'));

Output:

string(146) "The Cat is very happy to stay on stackoverflow
because it makes him Happy to know that it
is the best Place to find good people with
good Answers."
string(153) "The Chat is very happy to stay on stackoverflow
because it makes him Heureux to know that it
is the best Endroit to find good people with
good Réponses."

ThW answered Nov 13 '22 05:11


What version of PHP are you on? I don't know what else could be different, but I copied & pasted your code and got the following output:

SimpleXMLElement Object
(
    [0] => Service DNS
    [1] => DNS Gratuit
)

Just to be sure, this is the code I copied from above:

<?php

function b($params) {
    $xmldata = '<?xml version="1.0" encoding="UTF-8" ?><root>' . html_entity_decode($params['data']) . '</root>';
    $lang = ucfirst(strtolower($params['lang']));
    if (simplexml_load_string($xmldata) === FALSE) {
        return $params['data'];
    } else {
        $langxmlobj = new SimpleXMLElement($xmldata);

        if ($langxmlobj -> $lang) {
            return $langxmlobj -> $lang;
        } else {
            return $params['data'];
        }
    }
}

$params['data'] = '<French>Service DNS</French><English>DNS Service</English> - <French>DNS Gratuit</French><English>Free DNS</English>';
$params['lang'] = 'French';
$a = b($params);
print_r($a);
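
One thing that might explain the difference, offered as a sketch and not a diagnosis: $langxmlobj->$lang is a list of every matching child element, so print_r shows both entries, while echoing it or casting it to a string yields only the first. The example below assumes the same input as above; the names $root, $first, $parts and $joined are mine, and the ' - ' separator is hard-coded as an assumption because SimpleXML does not expose the text between the elements.

```php
<?php
$xml = '<root><French>Service DNS</French><English>DNS Service</English> - '
     . '<French>DNS Gratuit</French><English>Free DNS</English></root>';

$root = simplexml_load_string($xml);

// Casting the element list to a string returns only the first match.
$first = (string) $root->French;

// Iterating visits every <French> child of the root.
$parts = array();
foreach ($root->French as $element) {
    $parts[] = (string) $element;
}

// Assumption: rejoin with a fixed separator, since the original ' - ' text is lost.
$joined = implode(' - ', $parts);

echo $first, "\n", $joined, "\n";
```

This only works when the separator is known in advance. For free-form text around the tags, the regex or DOM approaches in the other answers keep the surrounding text intact.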

Joe T answered Nov 13 '22 04:11