
AWS boto3 page_iterator.search can't compare datetime.datetime to str

I am trying to capture delta files (files created after the last processing run) sitting on S3. To do that, I want to filter the boto3 page iterator with a query on the LastModified value, rather than returning the full list of files and filtering on the client side.

According to http://jmespath.org/?, the query below is valid and filters the following JSON response:

filtered_iterator = page_iterator.search(
    "Contents[?LastModified>='datetime.datetime(2016, 12, 27, 8, 5, 37, tzinfo=tzutc())'].Key")

for key_data in filtered_iterator:
    print(key_data)

However, it fails with:

RuntimeError: xxxxxxx has failed: can't compare datetime.datetime to str

Sample paginator response:

{
"Contents": [{
    "LastModified": "datetime.datetime(2016, 12, 28, 8, 5, 31, tzinfo=tzutc())",
    "ETag": "1022dad2540da33c35aba123476a4622",
    "StorageClass": "STANDARD",
    "Key": "blah1/blah11/abc.json",
    "Owner": {
        "DisplayName": "App-AWS",
        "ID": "bfc77ae78cf43fd1b19f24f99998cb86d6fd8220dbfce0ce6a98776253646656"
    },
    "Size": 623
}, {
    "LastModified": "datetime.datetime(2016, 12, 28, 8, 5, 37, tzinfo=tzutc())",
    "ETag": "1022dad2540da33c35abacd376a44444",
    "StorageClass": "STANDARD",
    "Key": "blah2/blah22/xyz.json",
    "Owner": {
        "DisplayName": "App-AWS",
        "ID": "bfc77ae78cf43fd1b19f24f99998cb86d6fd8220dbfce0ce6a81234e632c5a8c"
    },
    "Size": 702
}
]
}
East2West asked Jan 11 '17 07:01


2 Answers

The boto3 JMESPath implementation does not support date filtering (in your example it flags the operands as incompatible types, "unicode" and "datetime"). However, because of the way Amazon parses dates, you can compare them lexicographically using JMESPath's to_string() function.

Something like this:

"Contents[?to_string(LastModified)>='\"2015-01-01 01:01:01+00:00\"']"

But keep in mind that it's a lexicographical comparison, not a date comparison. It works most of the time, though.
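
To illustrate the caveat, here is a minimal standard-library sketch. It assumes to_string() JSON-encodes a datetime via its str() form, which would explain the embedded double quotes in the expression above. The helper as_jmespath_string is hypothetical and only approximates that behaviour; it is not part of boto3 or JMESPath.

```python
import json
from datetime import datetime, timedelta, timezone


def as_jmespath_string(value):
    """Approximate to_string() on a non-string value (an assumption):
    JSON-encode it, falling back to str() for types JSON does not know."""
    return json.dumps(value, default=str)


utc = timezone.utc
cutoff = datetime(2016, 12, 27, 8, 5, 37, tzinfo=utc)
newer = datetime(2016, 12, 28, 8, 5, 31, tzinfo=utc)

# The encoded text carries its own double quotes, so the value it is
# compared against needs them as well.
cutoff_text = as_jmespath_string(cutoff)
newer_text = as_jmespath_string(newer)

# Same layout and same UTC offset: text order matches date order.
same_format_agrees = (newer_text >= cutoff_text) == (newer >= cutoff)

# A different UTC offset breaks the equivalence: 12:00 at +05:00 is
# 07:00 UTC, which is before the cutoff, yet its text sorts after it.
plus_five = timezone(timedelta(hours=5))
shifted = datetime(2016, 12, 27, 12, 0, 0, tzinfo=plus_five)
text_says_newer = as_jmespath_string(shifted) >= cutoff_text
date_says_newer = shifted >= cutoff

print(cutoff_text)
print(same_format_agrees, text_says_newer, date_says_newer)
```

Since boto3 returns LastModified in UTC, the string comparison should line up with the date comparison as long as the cutoff is written with the same layout and the +00:00 offset.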

Daniel Hajduk answered Nov 09 '22 13:11


After spending a few minutes on the boto3 paginator documentation, I realised it is actually a syntax problem, which I had overlooked because I treated the value as a string.

The quote character that encloses the comparison value on the right must be a backquote/backtick [ ` ]. You cannot use a single quote [ ' ] for comparison values/objects.

After inspecting the JMESPath examples, I noticed they use backquotes for comparison values. So the boto3 paginator implementation does comply with the JMESPath standard.

Here is the code I ran without error using the backquote:

import boto3

s3 = boto3.client("s3")
s3_paginator = s3.get_paginator('list_objects')
s3_iterator = s3_paginator.paginate(Bucket='mytestbucket')

# The comparison value is wrapped in backticks rather than single quotes.
filtered_iterator = s3_iterator.search(
    "Contents[?LastModified >= `datetime.datetime(2016, 12, 27, 8, 5, 37, tzinfo=tzutc())`].Key"
)
for key_data in filtered_iterator:
    print(key_data)
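
If an expression-based filter does not return what you expect, one fallback is to compare the datetime objects directly in Python. The search() expression is evaluated on the client after each page has been fetched, so this should cost the same number of requests. The following is only a sketch: keys_modified_since is a hypothetical helper, and the sample pages are hand-built from the response in the question rather than fetched from S3.

```python
from datetime import datetime, timezone


def keys_modified_since(pages, cutoff):
    """Yield the Key of every object whose LastModified is at or after cutoff.

    `pages` is any iterable of list_objects-style page dicts.
    """
    for page in pages:
        # A page for an empty listing has no "Contents" entry.
        for obj in page.get("Contents", []):
            if obj["LastModified"] >= cutoff:
                yield obj["Key"]


# Hand-built stand-in for the paginator output shown in the question.
sample_pages = [
    {
        "Contents": [
            {
                "Key": "blah1/blah11/abc.json",
                "LastModified": datetime(2016, 12, 28, 8, 5, 31, tzinfo=timezone.utc),
            },
            {
                "Key": "blah2/blah22/xyz.json",
                "LastModified": datetime(2016, 12, 28, 8, 5, 37, tzinfo=timezone.utc),
            },
        ]
    }
]

cutoff = datetime(2016, 12, 28, 8, 5, 37, tzinfo=timezone.utc)
recent_keys = list(keys_modified_since(sample_pages, cutoff))
print(recent_keys)
```

With a real bucket, s3_paginator.paginate(Bucket='mytestbucket') could be passed as pages. The cutoff needs to be timezone-aware, because boto3 returns LastModified as a timezone-aware datetime and Python refuses to order naive and aware datetimes against each other.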
mootmoot answered Nov 09 '22 12:11