
boto3 KeyConditionExpressions must only contain one condition per key

Tags: boto3

I'm having trouble with the following code. The hardest part to understand is that the exception always occurs when many query operations happen within a short time interval.
The exception is as follows:

2017-03-05 15:03:59,053 data_sync_worker.py[line:83] ERROR An error occurred (ValidationException) when calling the Query operation: KeyConditionExpressions must only contain one condition per key
ClientError: An error occurred (ValidationException) when calling the Query operation: KeyConditionExpressions must only contain one condition per key

And here is my code:

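from decimal import Decimal
from boto3.dynamodb.conditions import Key

# Query one user's records: equality on the partition key, range on the sort key
response = self.record_tb.query(
    KeyConditionExpression=(
        Key(self.partition_key).eq(user_id)
        & Key(self.sort_key).between(
            begin_time + Decimal(CACHE_TIMESTAMP_MIN_STEP), endtime)))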

And here is the table key schema:

"KeySchema": [
    {
        "KeyType": "HASH", 
        "AttributeName": "user_id"
    }, 
    {
        "KeyType": "RANGE", 
        "AttributeName": "timestamp"
    }
]

Has anyone run into this?

Gary asked Mar 07 '17 03:03



2 Answers

This error can come from a malformed query in your code, and for many people that is a reasonable explanation. However, I've confirmed that you can also get this mysterious and very misleading error message if you run DynamoDB queries against a shared Table resource from multiple threads or tasks under heavy load. That's what I think is happening in the OP's case. I've seen this happen with boto3 1.9.82, both with the pathos library and with asyncio on Python 3.6.
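For reference, the malformed-query case mentioned at the start of this answer usually means the sort key appears in two separate conditions. The sketch below shows request parameters in the shape used by the low-level client's query call. The table name "records" and all values are placeholders, and the claim that the first form is rejected with this exact message is my understanding rather than something tested here.

```python
# Hypothetical parameters for the low-level DynamoDB client's query call.
# Table name and attribute values are placeholders for illustration.
values = {":u": {"S": "user-1"}, ":a": {"N": "100"}, ":b": {"N": "200"}}
names = {"#ts": "timestamp"}  # alias, because "timestamp" is a DynamoDB reserved word

# Two separate conditions on the sort key: the kind of expression that is
# expected to fail with "must only contain one condition per key".
rejected = {
    "TableName": "records",
    "KeyConditionExpression": "user_id = :u AND #ts > :a AND #ts < :b",
    "ExpressionAttributeNames": names,
    "ExpressionAttributeValues": values,
}

# One condition per key: equality on the partition key, a single BETWEEN
# on the sort key. Note that BETWEEN includes both endpoints.
accepted = {
    "TableName": "records",
    "KeyConditionExpression": "user_id = :u AND #ts BETWEEN :a AND :b",
    "ExpressionAttributeNames": names,
    "ExpressionAttributeValues": values,
}
```

The OP's code already uses the single-condition form (eq plus between), which is why a malformed query looks unlikely to be the cause there.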

To wit, this is something many of us have suspected for a long time: boto3 isn't completely thread-safe, even if it often works in practice.

In this particular case, I suspect some state gets corrupted during the query-building process, so that the query actually submitted to the service endpoint is invalid. I've not been able to reproduce this on demand; re-running the same code a second time always seems to work. It would be possible to use the botocore logger to capture the actual payloads sent to AWS, and that would prove my theory. However, capturing such a large volume of logs is really expensive on my end, so I just stopped using shared Table resources, and I stopped seeing the error.
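If someone does want to capture the payloads, a minimal sketch using only the standard logging module might look like the following. It assumes that botocore writes outgoing request parameters at DEBUG level under the "botocore.endpoint" logger; the function name, file path, and size limits are my own choices for illustration. A rotating file keeps the log volume bounded, and the thread name in each record helps correlate a bad request with the worker that sent it.

```python
import logging
from logging.handlers import RotatingFileHandler


def capture_botocore_requests(path, max_bytes=50_000_000, backups=3):
    """Route botocore endpoint DEBUG records to a size-capped rotating file.

    Assumption: request parameters are logged under "botocore.endpoint".
    """
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    # Include the thread name so records can be tied to a worker.
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(threadName)s %(name)s %(message)s"))
    logger = logging.getLogger("botocore.endpoint")
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return handler
```

Calling capture_botocore_requests("botocore_requests.log") once at startup, before the workers begin, should be enough; remove the returned handler from the logger when finished to stop capturing.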

killthrush answered Nov 03 '22 19:11



@killthrush's answer turned out to describe the cause of this error for us. As he points out, boto3 does not appear to be thread-safe, and we were reading from S3 from multiple concurrent threads.

If you are looking for a quick fix, I found that the comment by @pedros007 in the link he supplied works. When you set up the boto3 S3 client, set max_pool_connections to the number of workers you are running. Since doing that, we have stopped getting the ValidationException error.

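# code from pedros007
from concurrent.futures import ThreadPoolExecutor

import boto3
import botocore.config

num_threads = 16
# Size the HTTP connection pool to match the number of worker threads
cfg = botocore.config.Config(max_pool_connections=num_threads)
client = boto3.client("s3", config=cfg)
futures = {}
# keys and my_head_object_function are defined elsewhere by the caller
with ThreadPoolExecutor(max_workers=num_threads) as executor:
    for key in keys:
        f = executor.submit(my_head_object_function, key, client)
        futures[f] = key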
Others have suggested starting a new session in each thread. I tried that and it does work, but there is a large performance hit. With the method above, performance is unchanged.
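For the DynamoDB case in the question, one way to stop sharing a Table resource without creating a session on every call is to cache one resource per thread. This is a minimal sketch, not something I have benchmarked: get_table and its make_resource parameter are names invented here, and whether it avoids the performance hit described above is an assumption. The boto3 import is deferred so the helper can be exercised with a stand-in factory.

```python
import threading

_thread_state = threading.local()


def get_table(table_name, make_resource=None):
    """Return a DynamoDB Table object owned by the calling thread.

    make_resource is a hypothetical hook: a zero-argument callable that
    returns a DynamoDB service resource. By default each thread builds
    its own Session and resource on first use.
    """
    tables = getattr(_thread_state, "tables", None)
    if tables is None:
        tables = _thread_state.tables = {}
    if table_name not in tables:
        if make_resource is None:
            import boto3  # deferred import; only needed for the default path

            def make_resource():
                return boto3.session.Session().resource("dynamodb")
        # Created once per thread, then reused for later calls.
        tables[table_name] = make_resource().Table(table_name)
    return tables[table_name]
```

Each worker would then call get_table(...).query(...) instead of using a Table object stored on a shared instance, so the session setup cost is paid once per thread rather than once per request.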

jeffjv answered Nov 03 '22 19:11
