pugixml - get all text nodes (PCDATA), not just the first

Tags: c++, xml, pugixml

Currently, if I try to parse

<parent>
    First bit of text
    <child>
    </child>
    Second bit of text
</parent>

I only get First bit of text with

parent.text().get()

What's the correct way to grab all text nodes in parent?

  1. Is there a nice utility function for this?
  2. How could it be done by iterating through all children?
jozxyqk asked Feb 15 '14 07:02


1 Answer

There is no function that concatenates all text; if you want to get a list of text node children, you have two options:

  1. XPath query:

     pugi::xpath_node_set ns = parent.select_nodes("text()");
    
     for (size_t i = 0; i < ns.size(); ++i)
         std::cout << ns[i].node().value() << std::endl;
    
  2. Manual iteration with type checking:

     for (pugi::xml_node child = parent.first_child(); child; child = child.next_sibling())
         if (child.type() == pugi::node_pcdata)
             std::cout << child.value() << std::endl;
    

Note that if you can use C++11 then the second option can be much more concise:

for (pugi::xml_node child: parent.children())
    if (child.type() == pugi::node_pcdata)
        std::cout << child.value() << std::endl;

(Of course, you can also use a range-based for loop to iterate through an xpath_node_set.)

zeuxcg answered Sep 22 '22 17:09
