In my Scala code, I am fetching a response from a server using the getInputStream method of the HttpURLConnection class. The response is XML data. However, the data contains HTML entities like &amp; and &apos;.
Is there a way I can replace these characters with their text equivalent so that I can parse the XML properly?
When you use wizards to customize any string in your XML file, you can use the following special symbols: <, >, &, ', ". You can also use these symbols when you are editing a query in Expert Mode or when you are manually entering SQL code into XML files between CDATA tags.
Open an XML document in the text editing mode, right click inside it and there is a new menu item "Determine Complex Layout Chars".
The only illegal characters are &, < and > (as well as " or ' in attributes, depending on which character is used to delimit the attribute value: attr="must use &quot; here, ' is allowed" and attr='must use &apos; here, " is allowed'). They're escaped using XML entities; in this case you want &amp; for &.
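To make the escaping rules above concrete, here is a minimal hand-rolled sketch in Scala. The object XmlEscape and its methods text and attr are hypothetical names invented for this example, not part of any library, and the attr method assumes the attribute value will be delimited with double quotes.

```scala
// Hypothetical helper for illustration only; not a library API.
object XmlEscape {
  // Escape a string that will appear as text between tags.
  // Each character is handled in a single pass, so the '&' that
  // begins an inserted entity is never escaped a second time.
  def text(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '&' => sb.append("&amp;")
      case '<' => sb.append("&lt;")
      case '>' => sb.append("&gt;")
      case c   => sb.append(c)
    }
    sb.toString
  }

  // Escape a string that will be wrapped in double quotes as an
  // attribute value: same as text, plus the delimiter itself.
  def attr(s: String): String =
    text(s).replace("\"", "&quot;")
}
```

With this sketch, XmlEscape.text("a < b & c") gives a &lt; b &amp; c. A single quote is left alone by attr because it is allowed inside a double-quoted attribute.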
The ampersand character (&) starts entity markup (the first character of a character entity reference). The greater-than character (>) ends a start-tag or an end-tag.
It's necessary to encode those entities in XML so they don't interfere with its syntax. The &lt; (<) and &gt; (>) entities make this more obvious. It would be impossible to parse XML whose content was littered with < and > symbols.
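One way to see this for yourself is to hand a parser both an escaped and an unescaped string. The sketch below uses the JDK's built-in DOM parser so it needs no extra dependency; the object name WellFormedCheck and the sample strings are made up for illustration.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource
import scala.util.Try

// Hypothetical helper for illustration only.
object WellFormedCheck {
  // Returns true if the JDK parser accepts the string as well-formed XML.
  // On failure the default error handler may also print a message to stderr.
  def parses(xml: String): Boolean =
    Try {
      DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)))
    }.isSuccess

  def main(args: Array[String]): Unit = {
    println(parses("<a>1 &lt; 2</a>"))    // escaped '<' in content
    println(parses("<a>1 < 2</a>"))       // raw '<' in content
    println(parses("<a>Tom & Jerry</a>")) // raw '&' in content
  }
}
```

Only the first string is accepted. The raw < is read as the start of a tag and the raw & as the start of an entity reference, so the parser rejects both.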
Scala's scala.xml package should give you the tools you need to parse your XML. Here's some guidance from the library's author.
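As a rough sketch of what that could look like: an XML parser decodes the predefined entities for you, so there should be no need to replace them by hand before parsing. The example assumes the scala-xml module is on the classpath (it is a separate dependency in recent Scala versions). The object name ParseDemo, the method textOf, and the sample payload are all made up; the ByteArrayInputStream stands in for whatever connection.getInputStream returns in your code.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import java.nio.charset.StandardCharsets
import scala.xml.XML

// Hypothetical demo object; names and payload are illustrative only.
object ParseDemo {
  // Made-up response body containing &amp; and &apos; entities.
  val payload: String =
    "<note><to>Tom &amp; Jerry</to><body>It&apos;s done</body></note>"

  // Parse the stream and return the text of the direct child named `tag`.
  def textOf(in: InputStream, tag: String): String = {
    val root = XML.load(in) // the parser decodes the entities here
    (root \ tag).text
  }

  // Stand-in for connection.getInputStream.
  def sampleStream(): InputStream =
    new ByteArrayInputStream(payload.getBytes(StandardCharsets.UTF_8))

  def main(args: Array[String]): Unit = {
    println(textOf(sampleStream(), "to"))   // prints: Tom & Jerry
    println(textOf(sampleStream(), "body")) // prints: It's done
  }
}
```

If the server sends HTML-only entities such as &nbsp; that are not declared in the document, the parser will not know them, and that case would need separate handling.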