Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is faster: ScriptDb or SpreadsheetApp?

Let's say I have a a script that iterates over a list of 400 objects. Each object has anywhere from 1 to 10 properties. Each property is a reasonable size string or a somewhat large integer.

Is there a significant difference in performance of saving these objects into ScriptDB vs saving them into Spreadsheet(w/o doing it in one bulk operation).

like image 906
Anton Soradoi Avatar asked Feb 28 '13 21:02

Anton Soradoi


1 Answers

Executive Summary

Yes, there is a significant difference! Huge! And I have to admit that this experiment didn't turn out the way I expected.

With this amount of data, writing to a spreadsheet was always much faster than using ScriptDB.

These experiments support the assertions regarding bulk operations in the Google Apps Script Best Practices. Saving data in a spreadsheet using a single setValues() call was 75% faster than line-by-line, and two orders of magnitude faster than cell-by-cell.

On the other hand, recommendations to use Spreadsheet.flush() should be considered carefully, due to the performance impact. In these experiments, a single write of a 4000-cell spreadsheet took less than 50ms, and adding a call to flush() increased that to 610ms - still less than a second, but an order of magnitude tax seems ludicrous. Calling flush() for each of the 400 rows in the sample spreadsheet made the operation take almost 12 seconds, when it took just 164 ms without it. If you've been experiencing Exceeded maximum execution time errors, you may benefit from both optimizing your code AND removing calls to flush().

Experimental Results

All timings were derived following the technique described in How to measure time taken by a function to execute. Times are expressed in milliseconds.

Here are the results from a single pass of five different approaches, two using ScriptDB, three writing to Spreadsheets, all with the same source data. (400 objects with 5 String & 5 Number attributes)

Experiment 1

  • Elapsed time for ScriptDB/Object test: 53529
  • Elapsed time for ScriptDB/Batch test: 37700
  • Elapsed time for Spreadsheet/Object test: 145
  • Elapsed time for Spreadsheet/Attribute test: 4045
  • Elapsed time for Spreadsheet/Bulk test: 32

Effect of Spreadsheet.flush()

Experiment 2

In this experiment, the only difference from Experiment 1 was that we called Spreadsheet.flush() after every setValue/s call. The cost of doing so is dramatic, (around 700%) but does not change the recommendation to use a spreadsheet over ScriptDB for speed reasons, because writing to spreadsheets is still faster.

  • Elapsed time for ScriptDB/Object test: 55282
  • Elapsed time for ScriptDB/Batch test: 37370
  • Elapsed time for Spreadsheet/Object test: 11888
  • Elapsed time for Spreadsheet/Attribute test: 117388
  • Elapsed time for Spreadsheet/Bulk test: 610

Note: This experiment was often killed with Exceeded maximum execution time.

Caveat Emptor

You're reading this on the interwebs, so it must be true! But take it with a grain of salt.

  • These are results from very small sample sizes, and may not be completely reproducible.
  • These results are measuring something that changes constantly - while they were observed on Feb 28 2013, the system they measured could be completely different when you read this.
  • The efficiency of these operations is affected by many factors that are not controlled in these experiments; caching of instructions & intermediate results and server load, for example.
  • Maybe, just maybe, someone at Google will read this, and improve the efficiency of ScriptDB!

The Code

If you want to perform (or better yet, improve) these experiments, create a blank spreadsheet, and copy this into a new script within it. This is also available as a gist.

/**
 * Run experiments to measure speed of various approaches to saving data in
 * Google App Script (GAS).
 */
function testSpeed() {
  var numObj = 400;
  var numAttr = 10;
  var doFlush = false;  // Set true to activate calls to SpreadsheetApp.flush()

  var arr = buildArray(numObj,numAttr);
  var start, stop;  // time catchers
  var db = ScriptDb.getMyDb();
  var sheet;

  // Save into ScriptDB, Object at a time
  deleteAll(); // Clear ScriptDB
  start = new Date().getTime();
    for (var i=1; i<=numObj; i++) {
      db.save({type: "myObj", data:arr[i]});
    }
  stop = new Date().getTime();
  Logger.log("Elapsed time for ScriptDB/Object test: " + (stop - start));

  // Save into ScriptDB, Batch
  var items = [];
  // Restructure data - this is done outside the timed loop, assuming that
  // the data would not be in an array if we were using this approach.
  for (var obj=1; obj<=numObj; obj++) {
    var thisObj = new Object();
    for (var attr=0; attr < numAttr; attr++) {
      thisObj[arr[0][attr]] = arr[obj][attr];
    }
    items.push(thisObj);
  }
  deleteAll(); // Clear ScriptDB
  start = new Date().getTime();
    db.saveBatch(items, false);
  stop = new Date().getTime();
  Logger.log("Elapsed time for ScriptDB/Batch test: " + (stop - start));

  // Save into Spreadsheet, Object at a time
  sheet = SpreadsheetApp.getActive().getActiveSheet().clear();
  start = new Date().getTime();
    for (var row=0; row<=numObj; row++) {
      var values = [];
      values.push(arr[row]);
      sheet.getRange(row+1, 1, 1, numAttr).setValues(values);
      if (doFlush) SpreadsheetApp.flush();
    }
  stop = new Date().getTime();
  Logger.log("Elapsed time for Spreadsheet/Object test: " + (stop - start));

  // Save into Spreadsheet, Attribute at a time
  sheet = SpreadsheetApp.getActive().getActiveSheet().clear();
  start = new Date().getTime();
    for (var row=0; row<=numObj; row++) {
      for (var cell=0; cell<numAttr; cell++) {
        sheet.getRange(row+1, cell+1, 1, 1).setValue(arr[row][cell]);
        if (doFlush) SpreadsheetApp.flush();
      }
    }
  stop = new Date().getTime();
  Logger.log("Elapsed time for Spreadsheet/Attribute test: " + (stop - start));

  // Save into Spreadsheet, Bulk
  sheet = SpreadsheetApp.getActive().getActiveSheet().clear();
  start = new Date().getTime();
    sheet.getRange(1, 1, numObj+1, numAttr).setValues(arr);
    if (doFlush) SpreadsheetApp.flush();
  stop = new Date().getTime();
  Logger.log("Elapsed time for Spreadsheet/Bulk test: " + (stop - start));
}

/**
 * Create a two-dimensional array populated with 'numObj' rows of 'numAttr' cells.
 */
function buildArray(numObj,numAttr) {
  numObj = numObj | 400;
  numAttr = numAttr | 10;
  var array = [];
  for (var obj = 0; obj <= numObj; obj++) {
    array[obj] = [];
    for (var attr = 0; attr < numAttr; attr++) {
      var value;
      if (obj == 0) {
        // Define attribute names / column headers
        value = "Attr"+attr;
      }
      else {
        value = ((attr % 2) == 0) ? "This is a reasonable sized string for testing purposes, not too long, not too short." : Number.MAX_VALUE;
      }
      array[obj].push(value);
    }
  }
  return array
}

function deleteAll() {
  var db = ScriptDb.getMyDb();
  while (true) {
    var result = db.query({}); // get everything, up to limit
    if (result.getSize() == 0) {
      break;
    }
    while (result.hasNext()) {
      var item = result.next()
      db.remove(item);
    }
  }
}
like image 59
Mogsdad Avatar answered Oct 30 '22 04:10

Mogsdad