 

DynamoDB - Put item if hash (or hash and range combination) doesn't exist

Here are my use cases: I have a Dynamo table with a hash + range key. When I put new items in the table, I want to do a uniqueness check. Sometimes I want to guarantee that the hash is unique (ignoring the range). Other times I want to allow duplicate hashes, but guarantee that the hash and range combination is unique. How can I accomplish this?

I experimented with attribute_not_exists. It seems to handle the second case, where it checks the hash + range combination. Here's a PHP sample:

    $client->putItem(array(
        'TableName' => 'test',
        'Item' => array(
            'hash' => array('S' => 'abcdefg'),
            'range' => array('S' => 'some other value'),
            // DynamoDB transmits number (N) values as strings
            'whatever' => array('N' => '233')
        ),
        'ConditionExpression' => 'attribute_not_exists(hash)'
    ));

Oddly, it doesn't seem to matter if I use attribute_not_exists(hash) or attribute_not_exists(range). They both seem to do exactly the same thing. Is this how it's supposed to work?

Any idea how to handle the case where I only want to check hash for uniqueness?

mrog asked Sep 28 '15 23:09




1 Answer

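You can't check the hash alone for uniqueness on a hash+range table. All items in DynamoDB are indexed by either their hash or hash+range (depending on your table), and a condition is evaluated only against the item with the same full key.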

To summarize what is going on so far:

  • A single hash key can have multiple range keys.
  • Every item has both a hash and a range key.
  • You are making a PutItem request and must provide both the hash and range.
  • You are providing a ConditionExpression with attribute_not_exists on either the hash or range attribute name.
  • The attribute_not_exists condition merely checks whether an attribute with that name exists; it doesn't care about the value.
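
The last point can be illustrated with a small model in plain PHP. This is only a sketch of the semantics described above, not SDK code: attributeNotExists is a hypothetical helper, and $stored reuses the attribute names and values from the question.

```php
<?php
// Simplified model of attribute_not_exists; this is NOT part of the AWS SDK.
// $item is the existing item found under the request's key, or null if none.
function attributeNotExists(?array $item, string $name): bool
{
    // Only the presence of the attribute name matters; its value is never inspected.
    return $item === null || !array_key_exists($name, $item);
}

// An existing item, written in DynamoDB's typed attribute format.
$stored = [
    'hash'  => ['S' => 'abcdefg'],
    'range' => ['S' => 'some other value'],
];

$checks = [
    'hash on stored item'     => attributeNotExists($stored, 'hash'),     // false
    'range on stored item'    => attributeNotExists($stored, 'range'),    // false
    'whatever on stored item' => attributeNotExists($stored, 'whatever'), // true
    'hash when no item found' => attributeNotExists(null, 'hash'),        // true
];

var_export($checks);
```

Because every stored item in a hash+range table carries both key attributes, checking either name gives the same answer whenever an item is found.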

Let's walk through an example, starting with a hash+range key table that holds this data:

  1. hash=A,range=1
  2. hash=A,range=2

There are four possible cases:

  1. If you try to put an item with hash=A,range=3 and attribute_not_exists(hash), the PutItem will succeed because attribute_not_exists(hash) evaluates to true. No item exists with key hash=A,range=3, so there is no existing hash attribute to violate the condition.

  2. If you try to put an item with hash=A,range=3 and attribute_not_exists(range), the PutItem will succeed because attribute_not_exists(range) evaluates to true. No item exists with key hash=A,range=3, so there is no existing range attribute to violate the condition.

  3. If you try to put an item with hash=A,range=1 and attribute_not_exists(hash), the PutItem will fail because attribute_not_exists(hash) evaluates to false. An item exists with key hash=A,range=1, and it has a hash attribute.

  4. If you try to put an item with hash=A,range=1 and attribute_not_exists(range), the PutItem will fail because attribute_not_exists(range) evaluates to false. An item exists with key hash=A,range=1, and it has a range attribute.

This means that one of two situations applies:

  1. The hash+range pair exists in the database.
    • attribute_not_exists(hash) must be false
    • attribute_not_exists(range) must be false
  2. The hash+range pair does not exist in the database.
    • attribute_not_exists(hash) must be true
    • attribute_not_exists(range) must be true

In both cases, you get the same result regardless of whether you put the condition on the hash or the range attribute. The hash+range key identifies a single item in the entire table, and your condition is evaluated only against that item.
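
The walk-through can be reproduced with a small in-memory model. This is a sketch under stated assumptions rather than real DynamoDB behavior: conditionalPut and seedTable are hypothetical helpers, the table is a plain PHP array keyed by "hash|range", and a false return value stands in for a failed conditional check.

```php
<?php
// In-memory model of a hash+range table; this is NOT the AWS SDK.

/** Returns the example table from the walk-through: (A,1) and (A,2). */
function seedTable(): array
{
    return [
        'A|1' => ['hash' => 'A', 'range' => '1'],
        'A|2' => ['hash' => 'A', 'range' => '2'],
    ];
}

/**
 * Puts $item only if attribute_not_exists($conditionAttr) holds.
 * The condition is evaluated solely against the item that shares
 * the same hash+range key, never against other items with the same hash.
 */
function conditionalPut(array &$table, array $item, string $conditionAttr): bool
{
    $key = $item['hash'] . '|' . $item['range'];
    $existing = $table[$key] ?? null;

    if ($existing !== null && array_key_exists($conditionAttr, $existing)) {
        return false; // models a failed conditional check
    }

    $table[$key] = $item;
    return true;
}

// The four cases: [range value to put, attribute named in the condition].
$cases = [['3', 'hash'], ['3', 'range'], ['1', 'hash'], ['1', 'range']];

$results = [];
foreach ($cases as [$range, $attr]) {
    $table = seedTable(); // fresh table for every case
    $results[] = conditionalPut($table, ['hash' => 'A', 'range' => $range], $attr);
}

echo json_encode($results), PHP_EOL; // [true,true,false,false]
```

Cases 1 and 2 succeed and cases 3 and 4 fail, whichever key attribute the condition names.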

You are effectively performing a "put this item if an item with this hash+range key does not already exist".
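
In application code, that "put if the key does not already exist" pattern might look like the sketch below. It assumes the AWS SDK for PHP version 3 and an application bootstrap that has already loaded the Composer autoloader. putIfKeyAbsent is a hypothetical helper name, and the #h placeholder is used only so the attribute name cannot clash with a DynamoDB reserved word.

```php
<?php
// Assumes aws/aws-sdk-php (v3) is installed and its autoloader is already loaded.
use Aws\DynamoDb\Exception\DynamoDbException;

/**
 * Hypothetical helper: writes $item unless an item with the same
 * hash+range key already exists.
 *
 * @param \Aws\DynamoDb\DynamoDbClient $client
 * @return bool true if the item was written, false if the key was taken
 */
function putIfKeyAbsent($client, string $tableName, array $item): bool
{
    try {
        $client->putItem([
            'TableName' => $tableName,
            'Item' => $item,
            // Evaluated against the existing item with the same hash+range, if any.
            'ConditionExpression' => 'attribute_not_exists(#h)',
            'ExpressionAttributeNames' => ['#h' => 'hash'],
        ]);
        return true;
    } catch (DynamoDbException $e) {
        if ($e->getAwsErrorCode() === 'ConditionalCheckFailedException') {
            return false; // an item with this hash+range already exists
        }
        throw $e; // any other failure is a genuine error
    }
}
```

A call such as putIfKeyAbsent($client, 'test', $item) would then return false instead of overwriting an existing item. It still cannot tell you whether another item shares the same hash with a different range.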

mkobit answered Oct 16 '22 20:10
