Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parse XML using Google Apps Script

I'm having difficulty parsing the XML from a boardgamegeek query so that I may populate a google sheet with the data. Here's an example of the bgg xml:

<boardgames termsofuse="http://boardgamegeek.com/xmlapi/termsofuse">
  <boardgame objectid="423">
    <yearpublished>1995</yearpublished>
    <minplayers>3</minplayers>
    <maxplayers>6</maxplayers>
    <playingtime>300</playingtime>
    <name primary="true" sortindex="1">1856</name>
  </boardgame>
</boardgames>

And here's the Google Apps Script I've written to parse it:

//get the data from boardgamegeek
  var url = 'http://www.boardgamegeek.com/xmlapi/boardgame/' + bggCode;
  var bggXml = UrlFetchApp.fetch(url).getContentText();

  var document = XmlService.parse(bggXml);
  var root = document.getRootElement();     
  var entries = new Array();
  entries = root.getChildren('boardgame');

  for (var x = 0; x < entries.length; i++) {
    var name = entries[x].getAttribute('name').getValue();
    var yearpublished = entries[x].getAttribute('yearpublished').getValue();
    var minplayers = entries[x].getAttribute('minplayers').getValue();
    var maxplayers = entries[x].getAttribute('maxplayers').getValue();
  }
  //SpreadsheetApp.getActiveSheet().getRange(i+1,7).setValue(yearpublished);
  Logger.log(entries);

I'm currently getting an error in the for-loop caused by entries being NULL. If I comment the loop out and log what bggXml looks like, it looks just like the example above. However, logging variables further down I get the following:

document => [Document:  No DOCTYPE declaration, Root is [Element: <boardgames/>]]
root => [Element: <boardgames/>]
entries =>  [[Element: <boardgame/>]]
entries[2] => undefined

Since the bggXml looks exactly how I'd expect but document does not, I assume the problem is in the parsing?

like image 276
pr0t0 Avatar asked Oct 23 '14 15:10

pr0t0


1 Answers

After much trial and error and stumbling through the dark, I found the solution I was looking for. This will get the value of an individual xml element and set it to a variable:

var yearpublished = root.getChild('boardgame').getChild('yearpublished').getText();

So my final code looks like this. I hope it helps you in your endeavors.

//get the data from boardgamegeek
  var url = 'http://www.boardgamegeek.com/xmlapi/boardgame/' + bggCode;
  var bggXml = UrlFetchApp.fetch(url).getContentText();

  var document = XmlService.parse(bggXml);
  var root = document.getRootElement();

  //set variables to data from bgg
  var yearpublished = root.getChild('boardgame').getChild('yearpublished').getText();
  var minplayers = root.getChild('boardgame').getChild('minplayers').getText();
  var maxplayers = root.getChild('boardgame').getChild('maxplayers').getText();
  var playingtime = root.getChild('boardgame').getChild('playingtime').getText();
  var name = root.getChild('boardgame').getChild('name').getText();

  //populate sheet with variable data
  SpreadsheetApp.getActiveSheet().getRange(i+1,1).setValue(name);
  SpreadsheetApp.getActiveSheet().getRange(i+1,4).setValue(minplayers);
  SpreadsheetApp.getActiveSheet().getRange(i+1,5).setValue(maxplayers);
  SpreadsheetApp.getActiveSheet().getRange(i+1,5).setValue(playingtime);
  SpreadsheetApp.getActiveSheet().getRange(i+1,7).setValue(yearpublished);

In case you happen to also be querying BGG, there are multiple name elements. I want the one with the primary attribute set to "true". Iterating through those elements to find the correct one will be my next challenge.

like image 145
pr0t0 Avatar answered Sep 24 '22 15:09

pr0t0