
Dynamically reading raw XML elements as text in Java

Tags:

java

xml

stax

Assuming an XML file with unknown structure (i.e., unknown element and attribute names), like

<RootElement>
   <Level 1 ...>
        <Level 2 ...>
            ...
        </Level 2>
        <Level 2 ...>
            ...
        </Level 2>
    </Level 1>
    <Level 1 ...>
        <Level 2 ...>
            ...
        </Level 2>
        <Level 2 ...>
            ...
        </Level 2>
    </Level 1>
</RootElement>

Is there any way using StAX to get the full raw text of each element?

At least, how can this be done for the first level, i.e. in the above example (ignoring pretty printing) how can we read the following 2 strings in a Java String variable:

"<Level 1 ...><Level 2...>...</Level 2></Level 1>"

and

"<Level 1 ...><Level 2...>...</Level 2></Level 1>"
PNS asked Mar 27 '26 11:03


1 Answer

Use an XMLStreamReader and an XMLStreamWriter together to produce whatever raw XML you want. It might seem that some trick could give a simpler solution, but it can't: the XML has to be parsed, or else you are in deep water. And if you'd like to hack a parser, parsers are usually implemented with internal buffering, which makes it hairy work to cut up an incoming stream correctly.

Edit: Use the parsing pattern in this question to keep track of the level. To write, handle each event type from the input in its own way; note that for start element events you can iterate over all the attributes and also the namespaces.
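A minimal sketch of that reader/writer pairing, under some assumptions: the class name FirstLevelExtractor and its extract method are made up for illustration, "first level" is taken to mean the direct children of the root element (depth 2), and the whole document is passed in as a String. The output is re-serialized XML, not the original bytes, so quoting and whitespace inside tags may differ from the input. Namespaces declared on the root are not re-declared on the fragments, and processing instructions and entity references are skipped.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: the class name and the depth-2 rule are assumptions for illustration.
class FirstLevelExtractor {

    static List<String> extract(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
        List<String> fragments = new ArrayList<>();
        StringWriter buffer = null;
        XMLStreamWriter writer = null; // non-null only while inside a first-level element
        int depth = 0;                 // the root element is depth 1, its children depth 2

        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    depth++;
                    if (depth == 2) { // a new first-level element: start a fresh buffer
                        buffer = new StringWriter();
                        writer = outputFactory.createXMLStreamWriter(buffer);
                    }
                    if (writer != null) {
                        writeStartElement(reader, writer);
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    if (writer != null) {
                        writer.writeEndElement();
                        if (depth == 2) { // first-level element finished: collect its text
                            writer.flush();
                            writer.close();
                            fragments.add(buffer.toString());
                            writer = null;
                        }
                    }
                    depth--;
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.SPACE:
                    if (writer != null) writer.writeCharacters(reader.getText());
                    break;
                case XMLStreamConstants.CDATA:
                    if (writer != null) writer.writeCData(reader.getText());
                    break;
                case XMLStreamConstants.COMMENT:
                    if (writer != null) writer.writeComment(reader.getText());
                    break;
                default:
                    break; // other event types are skipped in this sketch
            }
        }
        reader.close();
        return fragments;
    }

    // Copies the current start element, its namespace declarations and its attributes.
    private static void writeStartElement(XMLStreamReader reader, XMLStreamWriter writer)
            throws XMLStreamException {
        String ns = reader.getNamespaceURI();
        if (ns == null || ns.isEmpty()) {
            writer.writeStartElement(reader.getLocalName());
        } else {
            String prefix = reader.getPrefix();
            writer.writeStartElement(prefix == null ? "" : prefix, reader.getLocalName(), ns);
        }
        for (int i = 0; i < reader.getNamespaceCount(); i++) {
            String prefix = reader.getNamespacePrefix(i);
            String uri = reader.getNamespaceURI(i);
            if (uri == null) uri = "";
            if (prefix == null || prefix.isEmpty()) writer.writeDefaultNamespace(uri);
            else writer.writeNamespace(prefix, uri);
        }
        for (int i = 0; i < reader.getAttributeCount(); i++) {
            String attrNs = reader.getAttributeNamespace(i);
            String name = reader.getAttributeLocalName(i);
            String value = reader.getAttributeValue(i);
            if (attrNs == null || attrNs.isEmpty()) writer.writeAttribute(name, value);
            else writer.writeAttribute(reader.getAttributePrefix(i), attrNs, name, value);
        }
    }
}
```

Calling FirstLevelExtractor.extract(xml) on a well-formed document should return one string per child of the root. For large files, the same loop could take an InputStream and hand each fragment to a callback instead of collecting a list.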

ThomasRS answered Mar 30 '26 00:03