I've written a script to iterate through a large number of files in a Google Drive folder. Due to the processing I am doing on those files it exceeds the maximum execution time. Naturally I wrote into the script to use DriveApp.continueFileIterator(continuationToken): the token gets stored in the Project Properties and when the script runs it checks to see if there's a token, if there is it creates the FileIterator from the token if not it starts afresh.
What have I found is even though the script restarts with the continuation token it still starts from the beginning of the iteration, trying to process the same files again which wastes time for the subsequent executions. Have I missed something vital as in a command or method to make it start from where it left off? Am I supposed to update the continuation token at various stages thoughout the while(contents.hasNext()) loop?
Here's the sample code slimmed down to give you an idea:
function listFilesInFolder() {
var id= '0fOlDeRiDg';
var scriptProperties = PropertiesService.getScriptProperties();
var continuationToken = scriptProperties.getProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN');
var lastExecution = scriptProperties.getProperty('LAST_EXECUTION');
if (continuationToken == null) {
// first time execution, get all files from drive folder
var folder = DriveApp.getFolderById(id);
var contents = folder.getFiles();
// get the token and store it in a project property
var continuationToken = contents.getContinuationToken();
scriptProperties.setProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN', continuationToken);
} else {
// we continue to import from where we left
var contents = DriveApp.continueFileIterator(continuationToken);
}
var file;
var fileID;
var name;
var dateCreated;
while(contents.hasNext()) {
file = contents.next();
fileID = file.getId();
name = file.getName();
dateCreated = file.getDateCreated();
if(dateCreated > lastExecution) {
processFiles(fileID);
}
}
// Finished processing files so delete continuation token
scriptProperties.deleteProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN');
var currentExecution = Utilities.formatDate(new Date(), "GMT", "yyyy-MM-dd HH:mm:ss");
scriptProperties.setProperty('LAST_EXECUTION',currentExecution);
};
Step 1:The first call returns the result set and the continuation token. Step 2:Second call with header set Sphere-Continuation=<yourContinuationtoken> Step 3:Finally you will get continuationToken=nullif the list is completed and no more data to be returned from the service. If there is any further update in this matter, then we will let you know.
Please set the request header as Sphere-Continuation=<yourContinuationtoken> in your REST API call. Example: Step 1:The first call returns the result set and the continuation token. Step 2:Second call with header set Sphere-Continuation=<yourContinuationtoken>
This method is useful if processing an iterator in one execution would exceed the maximum execution time. Continuation tokens are generally valid for one week. FolderIterator — a collection of folders that remained in a previous iterator when the continuation token was generated
Like Jonathon said, you're comparing dates wrongly. But that's not the main issue with your script nor what you asked.
The main concept you're getting wrong is that the continuation token can't be saved before you do your loop. When you get the token, it saves where you were at that moment, if you continue iterating afterwards, that's not saved and you will repeat those steps later, just like you're experiencing.
To get the token later you cannot let your script terminate with an error. You have to measure how many files you can process under 5 minutes and stop your script manually before that, so you can have a chance at saving the token.
Here's the correct way of doing it:
function listFilesInFolder() {
var MAX_FILES = 20; //use a safe value, don't be greedy
var id = 'folder-id';
var scriptProperties = PropertiesService.getScriptProperties();
var lastExecution = scriptProperties.getProperty('LAST_EXECUTION');
if( lastExecution === null )
lastExecution = '';
var continuationToken = scriptProperties.getProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN');
var iterator = continuationToken == null ?
DriveApp.getFolderById(id).getFiles() : DriveApp.continueFileIterator(continuationToken);
try {
for( var i = 0; i < MAX_FILES && iterator.hasNext(); ++i ) {
var file = iterator.next();
var dateCreated = formatDate(file.getDateCreated());
if(dateCreated > lastExecution)
processFile(file);
}
} catch(err) {
Logger.log(err);
}
if( iterator.hasNext() ) {
scriptProperties.setProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN', iterator.getContinuationToken());
} else { // Finished processing files so delete continuation token
scriptProperties.deleteProperty('IMPORT_ALL_FILES_CONTINUATION_TOKEN');
scriptProperties.setProperty('LAST_EXECUTION', formatDate(new Date()));
}
}
function formatDate(date) { return Utilities.formatDate(date, "GMT", "yyyy-MM-dd HH:mm:ss"); }
function processFile(file) {
var id = file.getId();
var name = file.getName();
//your processing...
Logger.log(name);
}
Anyway, it may be possible that a file gets created between your runs and you do not get it on your continued-iteration. Then, by saving the execution time after your the last run, you may miss it on your next run too. I do not know your use-case, if it's acceptable to eventually reprocess some files or to miss some. If you can't have either situations at all, then the only solution I see is to save the ids of all files you have already processed. You may need to store those on a drive file, because PropertiesService may be too small for too many ids.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With