Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Correct usage of DriveApp.continueFileIterator(continuationToken)

I've written a script to iterate through a large number of files in a Google Drive folder. Due to the processing I am doing on those files it exceeds the maximum execution time. Naturally I wrote into the script to use DriveApp.continueFileIterator(continuationToken): the token gets stored in the Project Properties and when the script runs it checks to see if there's a token, if there is it creates the FileIterator from the token if not it starts afresh.

What have I found is even though the script restarts with the continuation token it still starts from the beginning of the iteration, trying to process the same files again which wastes time for the subsequent executions. Have I missed something vital as in a command or method to make it start from where it left off? Am I supposed to update the continuation token at various stages thoughout the while(contents.hasNext()) loop?

Here's the sample code slimmed down to give you an idea:

function listFilesInFolder() {
  var id= '0fOlDeRiDg';
  var scriptProperties = PropertiesService.getScriptProperties();
  var continuationToken = scriptProperties.getProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN');
  var lastExecution = scriptProperties.getProperty('LAST_EXECUTION');
  if (continuationToken == null) {
    // first time execution, get all files from drive folder
    var folder = DriveApp.getFolderById(id);
    var contents = folder.getFiles();
    // get the token and store it in a project property
    var continuationToken = contents.getContinuationToken();
    scriptProperties.setProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN', continuationToken);
  } else {
    // we continue to import from where we left
    var contents = DriveApp.continueFileIterator(continuationToken);
  }
  var file;
  var fileID;
  var name;
  var dateCreated;

  while(contents.hasNext()) {
    file = contents.next();
    fileID = file.getId();
    name = file.getName();
    dateCreated = file.getDateCreated();
    if(dateCreated > lastExecution) {
      processFiles(fileID);
    }
  }
  // Finished processing files so delete continuation token
  scriptProperties.deleteProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN');
  var currentExecution = Utilities.formatDate(new Date(), "GMT", "yyyy-MM-dd HH:mm:ss");
  scriptProperties.setProperty('LAST_EXECUTION',currentExecution);
};
like image 571
user3412868 Avatar asked Mar 12 '14 23:03

user3412868


People also ask

How to get the continuation token of a list?

Step 1:The first call returns the result set and the continuation token. Step 2:Second call with header set Sphere-Continuation=<yourContinuationtoken> Step 3:Finally you will get continuationToken=nullif the list is completed and no more data to be returned from the service. If there is any further update in this matter, then we will let you know.

How to get the continuation token from REST API call?

Please set the request header as Sphere-Continuation=<yourContinuationtoken> in your REST API call. Example: Step 1:The first call returns the result set and the continuation token. Step 2:Second call with header set Sphere-Continuation=<yourContinuationtoken>

What is the purpose of the continuation method?

This method is useful if processing an iterator in one execution would exceed the maximum execution time. Continuation tokens are generally valid for one week. FolderIterator — a collection of folders that remained in a previous iterator when the continuation token was generated


1 Answers

Like Jonathon said, you're comparing dates wrongly. But that's not the main issue with your script nor what you asked.

The main concept you're getting wrong is that the continuation token can't be saved before you do your loop. When you get the token, it saves where you were at that moment, if you continue iterating afterwards, that's not saved and you will repeat those steps later, just like you're experiencing.

To get the token later you cannot let your script terminate with an error. You have to measure how many files you can process under 5 minutes and stop your script manually before that, so you can have a chance at saving the token.

Here's the correct way of doing it:

function listFilesInFolder() {
  var MAX_FILES = 20; //use a safe value, don't be greedy
  var id = 'folder-id';
  var scriptProperties = PropertiesService.getScriptProperties();
  var lastExecution = scriptProperties.getProperty('LAST_EXECUTION');
  if( lastExecution === null )
    lastExecution = '';

  var continuationToken = scriptProperties.getProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN');
  var iterator = continuationToken == null ?
    DriveApp.getFolderById(id).getFiles() : DriveApp.continueFileIterator(continuationToken);


  try { 
    for( var i = 0; i < MAX_FILES && iterator.hasNext(); ++i ) {
      var file = iterator.next();
      var dateCreated = formatDate(file.getDateCreated());
      if(dateCreated > lastExecution)
        processFile(file);
    }
  } catch(err) {
    Logger.log(err);
  }

  if( iterator.hasNext() ) {
    scriptProperties.setProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN', iterator.getContinuationToken());
  } else { // Finished processing files so delete continuation token
    scriptProperties.deleteProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN');
    scriptProperties.setProperty('LAST_EXECUTION', formatDate(new Date()));
  }
}

function formatDate(date) { return Utilities.formatDate(date, "GMT", "yyyy-MM-dd HH:mm:ss"); }

function processFile(file) {
  var id = file.getId();
  var name = file.getName();
  //your processing...
  Logger.log(name);
}

Anyway, it may be possible that a file gets created between your runs and you do not get it on your continued-iteration. Then, by saving the execution time after your the last run, you may miss it on your next run too. I do not know your use-case, if it's acceptable to eventually reprocess some files or to miss some. If you can't have either situations at all, then the only solution I see is to save the ids of all files you have already processed. You may need to store those on a drive file, because PropertiesService may be too small for too many ids.

like image 187
Henrique G. Abreu Avatar answered Oct 18 '22 21:10

Henrique G. Abreu