Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Periodically refresh IMPORTXML() spreadsheet function

I have a large sheet with around 30 importxml functions that obtain data from a website that updates usually twice a day.

I would like to run the importxml function on a timely basis (every 8 hours) for my Google Spreadsheet to save the data in another sheet. The saving already works, however the updating does not!

I read in Google Spreadsheet row update that it might run every 2 hours, however I do not believe that this is true, because since I added it to my sheet nothing has changed or updated, when the spreadsheet is NOT opened.

How can I "trigger" the importxml function in my Google Spreadsheet in an easy way, as I have a lot of importxml functions in it?

like image 668
Carol.Kar Avatar asked Nov 23 '15 14:11

Carol.Kar


People also ask

How often does Importxml refresh Google Sheets?

Data Freshness To ensure users get fresh data while they keep their usage reasonable, IMPORTDATA, IMPORTHTML, and IMPORTXML share some rules: All three functions automatically check for updates every hour while the document is open, even if the formula and sheet don't change.


2 Answers

I made a couple of adjustments to Mogsdad's answer:

  • Fixed the releaseLock() call placement
  • Updates (or adds) a querystring parameter to the url in the import function (as opposed to storing, removing, waiting 5 seconds, and then restoring all relevant formulas)
  • Works on a specific sheet in your spreadsheet
  • Shows time of last update

...

function RefreshImports() {
  var lock = LockService.getScriptLock();
  if (!lock.tryLock(5000)) return;             // Wait up to 5s for previous refresh to end.

  var id = "[YOUR SPREADSHEET ID]";
  var ss = SpreadsheetApp.openById(id);
  var sheet = ss.getSheetByName("[SHEET NAME]");
  var dataRange = sheet.getDataRange();
  var formulas = dataRange.getFormulas();
  var content = "";
  var now = new Date();
  var time = now.getTime();
  var re = /.*[^a-z0-9]import(?:xml|data|feed|html|range)\(.*/gi;
  var re2 = /((\?|&)(update=[0-9]*))/gi;
  var re3 = /(",)/gi;

  for (var row=0; row<formulas.length; row++) {
    for (var col=0; col<formulas[0].length; col++) {
      content = formulas[row][col];
      if (content != "") {
        var match = content.search(re);
        if (match !== -1 ) {
          // import function is used in this cell
          var updatedContent = content.toString().replace(re2,"$2update=" + time);
          if (updatedContent == content) {
            // No querystring exists yet in url
            updatedContent = content.toString().replace(re3,"?update=" + time + "$1");
          }
          // Update url in formula with querystring param
          sheet.getRange(row+1, col+1).setFormula(updatedContent);
        }
      }
    }
  }

  // Done refresh; release the lock.
  lock.releaseLock();

  // Show last updated time on sheet somewhere
  sheet.getRange(7,2).setValue("Rates were last updated at " + now.toLocaleTimeString())
}
like image 73
Gerbus Avatar answered Sep 19 '22 09:09

Gerbus


The Google Spreadsheet row update question and its answers refer to the "Old Sheets", which had different behaviour than the 2015 version of Google Sheets does. There is no automatic refresh of content with "New Sheets"; changes are only evaluated now in response to edits.

While Sheets no longer provides this capability natively, we can use a script to refresh the "import" formulas (IMPORTXML, IMPORTDATA, IMPORTHTML and IMPORTANGE).

Utility script

For periodic refresh of IMPORT formulas, set this function up as a time-driven trigger.

Caveats:

  • Import function Formula changes made to the spreadsheet by other scripts or users during the refresh period COULD BE OVERWRITTEN.
  • Overlapping refreshes might make your spreadsheet unstable. To mitigate that, the utility script uses a ScriptLock. This may conflict with other uses of that lock in your script.

 

/**
 * Go through all sheets in a spreadsheet, identify and remove all spreadsheet
 * import functions, then replace them a while later. This causes a "refresh"
 * of the "import" functions. For periodic refresh of these formulas, set this
 * function up as a time-based trigger.
 *
 * Caution: Formula changes made to the spreadsheet by other scripts or users
 * during the refresh period COULD BE OVERWRITTEN.
 *
 * From: https://stackoverflow.com/a/33875957/1677912
 */
function RefreshImports() {
  var lock = LockService.getScriptLock();
  if (!lock.tryLock(5000)) return;             // Wait up to 5s for previous refresh to end.
  // At this point, we are holding the lock.

  var id = "YOUR-SHEET-ID";
  var ss = SpreadsheetApp.openById(id);
  var sheets = ss.getSheets();

  for (var sheetNum=0; sheetNum<sheets.length; sheetNum++) {
    var sheet = sheets[sheetNum];
    var dataRange = sheet.getDataRange();
    var formulas = dataRange.getFormulas();
    var tempFormulas = [];
    for (var row=0; row<formulas.length; row++) {
      for (col=0; col<formulas[0].length; col++) {
        // Blank all formulas containing any "import" function
        // See https://regex101.com/r/bE7fJ6/2
        var re = /.*[^a-z0-9]import(?:xml|data|feed|html|range)\(.*/gi;
        if (formulas[row][col].search(re) !== -1 ) {
          tempFormulas.push({row:row+1,
                             col:col+1,
                             formula:formulas[row][col]});
          sheet.getRange(row+1, col+1).setFormula("");
        }
      }
    }

    // After a pause, replace the import functions
    Utilities.sleep(5000);
    for (var i=0; i<tempFormulas.length; i++) {
      var cell = tempFormulas[i];
      sheet.getRange( cell.row, cell.col ).setFormula(cell.formula)
    }

    // Done refresh; release the lock.
    lock.releaseLock();
  }
}
like image 31
Mogsdad Avatar answered Sep 20 '22 09:09

Mogsdad