
How to avoid encoding of <,>,& with Document.createTextNode

Tags: java, xml

import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.*;

class XMLencode {
    public static void main(String[] args) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = factory.newDocumentBuilder();
            Document doc = docBuilder.newDocument();
            Element root = doc.createElement("roseindia");
            doc.appendChild(root);

            // The markup is passed as plain text, so it ends up in a single text node
            Text elmnt = doc.createTextNode("<data>sun</data><abcdefg/><end/>");
            root.appendChild(elmnt);

            // Serialize the document to standard output
            TransformerFactory tranFactory = TransformerFactory.newInstance();
            Transformer aTransformer = tranFactory.newTransformer();
            Source src = new DOMSource(doc);
            Result dest = new StreamResult(System.out);
            aTransformer.transform(src, dest);
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}

The code above generates this output:

<?xml version="1.0" encoding="UTF-8" standalone="no"?><roseindia>&lt;data&gt;sun&lt;/data&gt;&lt;abcdefg/&gt;&lt;end/&gt;</roseindia>

I don't want the tags to be encoded. I need the output to look like this:

<?xml version="1.0" encoding="UTF-8" standalone="no"?><roseindia><data>sun</data><abcdefg/><end/></roseindia>

Please help me with this.

Thanks, Mohan

user1686082 asked Sep 21 '12 06:09



2 Answers

Short Answer

You can use an XML CDATA section to keep characters such as <, >, and & from being escaped. With the DOM API, create one like this:

doc.createCDATASection("<foo/>");

The serialized content will be:

<![CDATA[<foo/>]]>

Long Answer

Below is a complete example of leveraging a CDATA section using the DOM APIs.

package forum12525152;

import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.*;

public class Demo {

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.newDocument();

        Element rootElement = document.createElement("root");
        document.appendChild(rootElement);

        // Create Element with a Text Node
        Element fooElement = document.createElement("foo");
        fooElement.setTextContent("<foo/>");
        rootElement.appendChild(fooElement);

        // Create Element with a CDATA Section
        Element barElement = document.createElement("bar");
        CDATASection cdata = document.createCDATASection("<bar/>");
        barElement.appendChild(cdata);
        rootElement.appendChild(barElement);

        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer t = tf.newTransformer();
        DOMSource source = new DOMSource(document);
        StreamResult result = new StreamResult(System.out);
        t.transform(source, result);
    }

}

Output

Note the difference in the foo and bar elements even though they have similar content. I have formatted the result of running the demo code to make it more readable:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root>
    <foo>&lt;foo/&gt;</foo>
    <bar><![CDATA[<bar/>]]></bar>
</root>
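
One point worth checking before relying on CDATA: the section changes how the characters are written, not what they are. A parser reading the output back still sees the markup inside the CDATA section as character data, not as child elements. The sketch below illustrates this with the standard JAXP parser; the class name `CdataReadBack` and the helper `parse` are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class CdataReadBack {

    // Hypothetical helper: parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root><bar><![CDATA[<bar/>]]></bar></root>");
        Element outer = (Element) doc.getDocumentElement().getFirstChild();

        // The markup inside the CDATA section comes back as a plain string.
        System.out.println(outer.getTextContent());

        // Only the outer bar element exists; the inner "<bar/>" is text.
        System.out.println(doc.getElementsByTagName("bar").getLength());
    }
}
```

If the consumer of the XML needs real child elements rather than text that looks like markup, the nodes have to be created as elements.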
bdoughan answered Sep 30 '22 12:09


createTextNode always treats its argument as character data, so the markup gets escaped on output. Instead of writing doc.createTextNode("<data>sun</data><abcdefg/><end/>"), create each element as its own DOM node:

import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import org.w3c.dom.*;

class XMLencode {
    public static void main(String[] args) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = factory.newDocumentBuilder();
            Document doc = docBuilder.newDocument();
            Element root = doc.createElement("roseindia");
            doc.appendChild(root);

            // <data>sun</data>: an element holding a text node
            Element data = doc.createElement("data");
            root.appendChild(data);
            Text text = doc.createTextNode("sun");
            data.appendChild(text);

            // <abcdefg/> and <end/>: empty elements
            Element abcdefg = doc.createElement("abcdefg");
            root.appendChild(abcdefg);
            Element end = doc.createElement("end");
            root.appendChild(end);

            // Serialize the document to standard output
            TransformerFactory tranFactory = TransformerFactory.newInstance();
            Transformer aTransformer = tranFactory.newTransformer();
            Source src = new DOMSource(doc);
            Result dest = new StreamResult(System.out);
            aTransformer.transform(src, dest);
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}
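
When the markup arrives as a string and cannot be built element by element, one option is to parse the string and import the resulting nodes into the target document. The following is a minimal sketch of that idea, not part of the original answer. It assumes the fragment is well-formed XML; `FragmentImport`, `appendFragment`, `build`, and the throwaway `wrapper` element are hypothetical names introduced for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FragmentImport {

    // Hypothetical helper: parses an XML fragment and appends its nodes to parent.
    static void appendFragment(Element parent, String fragment, DocumentBuilder builder)
            throws Exception {
        // A fragment may have several top-level elements, so wrap it in a throwaway root.
        String wrapped = "<wrapper>" + fragment + "</wrapper>";
        Document parsed = builder.parse(new InputSource(new StringReader(wrapped)));
        NodeList children = parsed.getDocumentElement().getChildNodes();
        Document target = parent.getOwnerDocument();
        for (int i = 0; i < children.getLength(); i++) {
            // importNode makes a deep copy owned by the target document.
            parent.appendChild(target.importNode(children.item(i), true));
        }
    }

    // Builds the roseindia document around the fragment and serializes it to a string.
    static String build(String fragment) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element root = doc.createElement("roseindia");
        doc.appendChild(root);
        appendFragment(root, fragment, builder);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build("<data>sun</data><abcdefg/><end/>"));
    }
}
```

Because the imported nodes are real elements, the serializer writes them as tags instead of escaping them. If the fragment is not well-formed, `builder.parse` throws an exception, so untrusted input needs its own error handling.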
swemon answered Sep 30 '22 11:09