
How to fetch and parse an XML file using AppleScript?

Tags:

applescript

There's an XML file on some remote server (http://foo/bar.xml):

<?xml version="1.0" encoding="UTF-8"?> 
<foo> 
  bar
</foo>

How can I get the value "bar" using AppleScript?

Roberto Aloi asked Sep 02 '11 07:09


2 Answers

Here's what I've done:

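-- Download the XML to a temporary file (URL and file name are placeholders)
set file_tgt to (POSIX path of (path to temporary items)) & "file.xml"
do shell script "curl -L " & quoted form of "http://url.com/file.xml" & " -o " & quoted form of file_tgt
-- Read the value of the root element with System Events
tell application "System Events"
    set file_content to contents of XML file file_tgt
    tell file_content
        set my_value to value of XML element 1
    end tell
end tell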

Originally I used the "URL Access Scripting" app to fetch the file, but since it was removed in Lion, I switched to plain curl, which works on both Snow Leopard and Lion.

Roberto Aloi answered Nov 09 '22 01:11


I found this thread which has an example of parsing an XML file with the XML tools available via System Events. Seems pretty convoluted to me though.

There's also this (freeware) scripting addition package for parsing/writing XML. Haven't looked at it, but it might be neat.

Personally, I would save my script as a script bundle and put a little PHP/Ruby/Perl/Python/whatever script in the bundle to parse the XML (since I'm just more comfortable with that). Then I'd have the AppleScript fetch the XML with cURL and pipe it to the parser script.

AppleScript:

set scriptPath to POSIX path of (path to me as alias) & "Contents/Resources/parse_xml.rb"
set fooValue to do shell script "curl http://foo/test.xml 2> /dev/null | ruby " & quoted form of scriptPath

parse_xml.rb could be something like this (using Ruby as an example):

require "rexml/document"

# load and parse the xml from stdin
xml = STDIN.read
doc = REXML::Document.new(xml)

# output the text of the root element (<foo>) stripped of leading/trailing whitespace
puts doc.root.text.strip

(Ruby and the REXML package should be readily available on any Mac, so it should work anywhere… I believe)

Point is, when the script runs it'll download the XML file with cURL, pass it to the Ruby script, and in the end, fooValue in the AppleScript will be set to "bar".

Of course, if the XML is more complex, you'll need more scripting, or you can take another look at the other options.
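As a rough idea of what "more scripting" might look like, here is a minimal sketch that uses REXML's XPath support to pull the text out of several child elements. The helper name xml_texts, the <item> elements and the sample XML are all made up for illustration; they are not part of the question's file.

```ruby
require "rexml/document"

# Hypothetical helper: return the stripped text of every element matching
# an XPath expression. The name and the sample XML below are illustrative only.
def xml_texts(xml, xpath)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, xpath).map { |element| element.text.to_s.strip }
end

# Assumed sample document with more than one value in it
sample_xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <foo>
    <item id="1">bar</item>
    <item id="2">baz</item>
  </foo>
XML

# Print one value per line so the calling script can split the output
puts xml_texts(sample_xml, "/foo/item").join("\n")
```

In the bundle setup above, the sample string would be replaced by STDIN.read, and the AppleScript side could split the result with something like "paragraphs of fooValue".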

There are probably even more ways of doing it (for instance, you could just do some string manipulation instead of full-on XML parsing, but that's a little brittle of course), but I'll stop here :)
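For what it's worth, the string-manipulation route could be sketched like this. The helper name text_between_tags is hypothetical, and the approach is brittle on purpose: it ignores attributes, nesting, CDATA, comments and entities, so it only suits trivially simple files like the one in the question.

```ruby
# Hypothetical helper: grab the text between <tag> and </tag> with a regular
# expression instead of a real XML parser. Returns nil when the tag pair
# is not found.
def text_between_tags(xml, tag)
  name = Regexp.escape(tag)
  match = xml.match(%r{<#{name}>(.*?)</#{name}>}m)
  match && match[1].strip
end

# Same shape as the file in the question
xml = %(<?xml version="1.0" encoding="UTF-8"?>\n<foo>\n  bar\n</foo>)
puts text_between_tags(xml, "foo")
```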

Flambino answered Nov 09 '22 00:11