I'm having problems looping over an XML file of about 20-30 MB (650,000 rows).
This is my meta-code:
<cffile action="READ" file="file.xml" variable="usersRaw">
<cfset usersXML = XmlParse(usersRaw)>
<cfset advsXML = XmlSearch(usersXML, "/advs/advuser")>
<cfset users = XmlSearch(usersXML, "/advs/advuser/user")>
<cfset numUsers = ArrayLen(users)>
<cfloop index="i" from="1" to="#numUsers#">
    ... some selects...
    ... insert...
    <cfset advs = advsXML[i]["vehicle"]>
    <cfset numAdvs = ArrayLen(advs)>
    <cfloop index="k" from="1" to="#numAdvs#">
        ... insert... or ... update...
    </cfloop>
</cfloop>
The structure of the XML file is as follows (yes, it is not very good :-)
<advs>
    <advuser>
        <user>
        </user>
        <vehicle>
        </vehicle>
    </advuser>
</advs>
After ~120,000 rows I get an error: "Out of memory".
How can I improve performance of my script?
How can I diagnose where there is max memory consumption?
@SamG is correct that ColdFusion's XML parsing can't handle this, because it uses a DOM parser that loads the whole document into memory. SAX is painful to work with, though. Use a StAX parser instead, which provides a much simpler iterator interface. See the answer I provided to another question for an example of how to do this with ColdFusion.
This is roughly what you'd do for your example:
<cfset fis = createObject("java", "java.io.FileInputStream").init(
    "#getDirectoryFromPath(getCurrentTemplatePath())#/file.xml"
)>
<cfset bis = createObject("java", "java.io.BufferedInputStream").init(fis)>
<cfset XMLInputFactory = createObject("java", "javax.xml.stream.XMLInputFactory").newInstance()>
<cfset reader = XMLInputFactory.createXMLStreamReader(bis)>
<!--- pull one parse event at a time instead of building the whole DOM --->
<cfloop condition="reader.hasNext()">
    <cfset event = reader.next()>
    <cfif event EQ reader.START_ELEMENT>
        <cfswitch expression="#reader.getLocalName()#">
            <cfcase value="advs">
                <!--- root node, do nothing --->
            </cfcase>
            <cfcase value="advuser">
                <!--- set values used later on for inserts, selects, updates --->
            </cfcase>
            <cfcase value="user">
                <!--- some selects and insert --->
            </cfcase>
            <cfcase value="vehicle">
                <!--- insert or update --->
            </cfcase>
        </cfswitch>
    </cfif>
</cfloop>
<!--- reader.close() does not close the underlying stream, so close that as well --->
<cfset reader.close()>
<cfset bis.close()>
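To get at the actual data inside a case block, you have to read it from the reader yourself. Here is a minimal sketch, assuming that <user> (and likewise <vehicle>) contains only simple text-only child elements. The question doesn't show the real children, so that is a guess, and readChildren is a hypothetical helper name, not part of any API. getLocalName(), getElementText(), hasNext() and next() are standard XMLStreamReader methods.

```cfml
<!--- Hypothetical helper: collects the text-only children of the current
      element into a struct. Assumes the reader is positioned on the
      element's START_ELEMENT and that no child has child elements of its
      own (getElementText() throws an XMLStreamException otherwise). --->
<cffunction name="readChildren" returntype="struct" output="false">
    <cfargument name="reader" required="true">
    <cfset var data = structNew()>
    <cfset var event = 0>
    <cfset var childName = "">
    <cfset var parentName = arguments.reader.getLocalName()>
    <cfloop condition="arguments.reader.hasNext()">
        <cfset event = arguments.reader.next()>
        <cfif event EQ arguments.reader.START_ELEMENT>
            <cfset childName = arguments.reader.getLocalName()>
            <!--- returns the text and leaves the reader on the child's END_ELEMENT --->
            <cfset data[childName] = arguments.reader.getElementText()>
        <cfelseif event EQ arguments.reader.END_ELEMENT
                  AND arguments.reader.getLocalName() EQ parentName>
            <!--- reached the closing tag of the element we started on --->
            <cfbreak>
        </cfif>
    </cfloop>
    <cfreturn data>
</cffunction>
```

Inside the "user" case you would then call <cfset userData = readChildren(reader)> and use the struct for your selects and insert. Only one element's worth of data is held at a time, which is what keeps memory flat.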
orangepips provides a reasonable solution. Also take a look at Ben Nadel's solution for handling very large XML files in ColdFusion. I have tested his approach on a 50 MB XML file with 1.2 million lines. Ben uses an approach similar to the one orangepips provides here: stream the file using Java, then XmlParse each node in ColdFusion to get to the goods. Check it out; like most of Ben Nadel's code and tutorials, it just works.
http://www.bennadel.com/blog/1345-Ask-Ben-Parsing-Very-Large-XML-Documents-In-ColdFusion.htm
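On the "where is the memory going" part of the question: a rough way to see it is to log JVM heap usage every N rows and watch where it climbs. The following is only a sketch. The log file name xmlImport and the 10,000-row interval are arbitrary choices of mine, and the cfloop is a stand-in for your real processing loop. The java.lang.Runtime methods used here (totalMemory, freeMemory, maxMemory) are standard JVM calls.

```cfml
<cfset runtime = createObject("java", "java.lang.Runtime").getRuntime()>
<cfset bytesPerMB = 1024 * 1024>

<!--- stand-in for the real loop over rows / parse events --->
<cfloop index="rowCount" from="1" to="50000">
    <!--- ... process one row here ... --->

    <cfif rowCount MOD 10000 EQ 0>
        <!--- heap currently in use = allocated heap minus free space within it --->
        <cfset usedMB = round((runtime.totalMemory() - runtime.freeMemory()) / bytesPerMB)>
        <cfset maxMB = round(runtime.maxMemory() / bytesPerMB)>
        <cflog file="xmlImport" text="rows=#rowCount# usedMB=#usedMB# maxMB=#maxMB#">
    </cfif>
</cfloop>
```

If usedMB is already close to maxMB right after XmlParse() and before the loop starts, the DOM itself is the problem. If it grows steadily inside the loop, look at what you accumulate per row.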