I need to parse an XML string with MATLAB (caution: without file I/O, so I don't want to write the string to a file and then read them). I'm receiving the strings from an HTTP connection and the parsing should be very fast. I'm mostly concerned about reading the values of certain tags in the entire string
The net is full of death threats about parsing XML with regexp so I didn't want to get into that just yet. I know MATLAB has seamless java integration but I'm not very java savvy. Is there a quick way to get certain values from XML very very rapidly?
For example I want to get the 'volume' information from this string below and write this to a variable.
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<root>
<volume>256</volume>
<length>0</length>
<time>0</time>
<state>stop</state>
....
Read XML File into MATLAB® Structure ArrayCreate a parsing function to read an XML file into a MATLAB® structure, and then read a sample XML file into the MATLAB workspace. To create the function parseXML , copy and paste this code into an m-file parseXML. m .
What is XML? The Extensible Markup Language (XML) is a simple text-based format for representing structured information: documents, data, configuration, books, transactions, invoices, and much more. It was derived from an older standard format called SGML (ISO 8879), in order to be more suitable for Web use.
To summarize: An XML file is a file used to store data in the form of hierarchical elements. Data stored in XML files can be read by computer programs with the help of custom tags, which indicate the type of element.
There's an entire class of functions for dealing with xml, including xmlread
and xmlwrite
. Those should be pretty useful for your problem.
For what it's worth, below is the Matlab executable Java code to perform the required task, without writing to an intermediate file:
%An XML formatted string
strXml = [...
'<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>' char(10)...
'<root>' char(10) ...
' <volume>256</volume>' char(10) ...
' <length>0</length>' char(10) ...
' <time>0</time>' char(10) ...
' <state>stop</state>' char(10) ...
'</root>' ];
%"simple" java code to create a document from said string
xmlDocument = javax.xml.parsers.DocumentBuilderFactory.newInstance().newDocumentBuilder.parse(java.io.StringBufferInputStream(strXml));
%"intuitive" methods to explore the xmlDocument
nodeList = xmlDocument.getElementsByTagName('volume');
numberOfNodes = nodeList.getLength();
firstNode = nodeList.item(0);
firstNodeContent = firstNode.getTextContent;
disp(firstNodeContent); %Returns '256'
As an alternative, if your application allows it, consider passing the URL directly into your XML parser. Untested java code is below, but that probably also opens up the Matlab built-in xslt
function as well.
xmlDocument = javax.xml.parsers.DocumentBuilderFactory.newInstance().newDocumentBuilder.parse('URL_AS_A_STRING_HERE');
Documentation here. Start at the "javax.xml.parsers" package.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With