
How to unescape XML in Java

Tags: java, xml, escaping

I need to unescape an XML string containing escaped XML tags:

&lt; &gt; &amp; etc.

I did find some libraries that can do this, but I'd rather use a single method for it.

Can someone help?

cheers, Bas Hendriks

Bas Hendriks asked May 14 '10 12:05




2 Answers

StringEscapeUtils.unescapeXml(xml) 

(from Apache Commons Lang)

Bozho answered Oct 02 '22 21:10


Here's a simple method to unescape XML. It handles the five predefined XML entities and decimal numeric character references (&#nnnn;). Modifying it to handle hexadecimal references (&#xhhhh;) should be simple.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public static String unescapeXML( final String xml ) {
        //'&' is excluded from the name so that a stray ampersand cannot swallow a later entity
        Pattern xmlEntityRegex = Pattern.compile( "&(#?)([^&;]+);" );
        //Matcher.appendReplacement needs a StringBuffer (StringBuilder overloads only exist since Java 9)
        StringBuffer unescapedOutput = new StringBuffer( xml.length() );

        Matcher m = xmlEntityRegex.matcher( xml );
        Map<String,String> builtinEntities = null;
        String entity;
        String hashmark;
        String ent;
        int code;
        while ( m.find() ) {
            ent = m.group(2);
            hashmark = m.group(1);
            if ( (hashmark != null) && (hashmark.length() > 0) ) {
                //decimal reference only; anything else (e.g. hex) throws NumberFormatException
                code = Integer.parseInt( ent );
                //toChars also handles code points above U+FFFF, which a (char) cast would truncate
                entity = new String( Character.toChars( code ) );
            } else {
                //must be a non-numerical entity
                if ( builtinEntities == null ) {
                    builtinEntities = buildBuiltinXMLEntityMap();
                }
                entity = builtinEntities.get( ent );
                if ( entity == null ) {
                    //not a known entity - leave it as it was
                    entity = "&" + ent + ';';
                }
            }
            //quoteReplacement stops '$' and '\' in the replacement from being read as group references
            m.appendReplacement( unescapedOutput, Matcher.quoteReplacement( entity ) );
        }
        m.appendTail( unescapedOutput );

        return unescapedOutput.toString();
    }

    private static Map<String,String> buildBuiltinXMLEntityMap() {
        Map<String,String> entities = new HashMap<String,String>(10);
        entities.put( "lt", "<" );
        entities.put( "gt", ">" );
        entities.put( "amp", "&" );
        entities.put( "apos", "'" );
        entities.put( "quot", "\"" );
        return entities;
    }
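One possible way to do the hex modification mentioned above is sketched below. It is only a sketch under a few assumptions: the class name `XmlUnescape` and method name `unescape` are made up for illustration, the regex is stricter than the one above (digits are validated by the pattern, capped at 6 hex or 7 decimal digits so `Integer.parseInt` cannot overflow), and anything that is not a known entity or a valid code point is left untouched rather than raising an exception.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class XmlUnescape {
    // Group 1: hex digits, group 2: decimal digits, group 3: entity name.
    private static final Pattern REFERENCE = Pattern.compile(
            "&(?:#x([0-9a-fA-F]{1,6})|#([0-9]{1,7})|([A-Za-z_][A-Za-z0-9_.-]*));");

    // The five entities predefined by XML.
    private static final Map<String, String> PREDEFINED = new HashMap<String, String>();
    static {
        PREDEFINED.put("lt", "<");
        PREDEFINED.put("gt", ">");
        PREDEFINED.put("amp", "&");
        PREDEFINED.put("apos", "'");
        PREDEFINED.put("quot", "\"");
    }

    static String unescape(String xml) {
        Matcher m = REFERENCE.matcher(xml);
        StringBuffer out = new StringBuffer(xml.length());
        while (m.find()) {
            String replacement;
            if (m.group(3) != null) {
                // Named entity: null if it is not one of the predefined five.
                replacement = PREDEFINED.get(m.group(3));
            } else {
                boolean hex = m.group(1) != null;
                int codePoint = Integer.parseInt(hex ? m.group(1) : m.group(2), hex ? 16 : 10);
                // toChars yields a surrogate pair for code points above U+FFFF.
                replacement = Character.isValidCodePoint(codePoint)
                        ? new String(Character.toChars(codePoint))
                        : null;
            }
            if (replacement == null) {
                // Unknown entity or out-of-range number: keep the original text.
                replacement = m.group();
            }
            // quoteReplacement keeps '$' and '\' literal in the output.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this sketch, `XmlUnescape.unescape("&lt;p&gt;&#65;&#x42; &amp; &nbsp;&lt;/p&gt;")` returns `<p>AB & &nbsp;</p>`. The input is scanned in a single pass, so `&amp;lt;` becomes `&lt;` and is not unescaped a second time.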
texclayton answered Oct 02 '22 21:10