
How do I batch delete with DynamoDB?

I am getting the error "The provided key element does not match the schema". uuid is my primary partition key, and I also have a primary sort key, version. I figured I could use batchWrite to delete all items with the same uuid.

My ES6 code is as follows:

delete(uuid) {
  const promise = new Promise();
  const params = {
    RequestItems: {
      [this.TABLE]: [
        {
          DeleteRequest: {
            Key: { uuid: uuid }
          }
        }
      ]
    }
  };


  // this._client references the DocumentClient
  this._client.batchWrite(params, function(err, data) {
    if (err) {
      // this gets hit with error
      console.log(err);
      return promise.reject(err);
    }

    console.log(result);
    return promise.resolve(result);
  });

  return promise;
}

I'm not sure why it errors on the key that is the primary key. I have seen posts about needing other indexes when searching by something that isn't a key, but I don't believe that's the case here.

Dave Stein asked Jul 19 '16 17:07



3 Answers

Here is a sample batch write delete request. The code has been tested and works. If you adapt it to your requirements, it should work.

Table Definition:-

Bag - Table Name

bag - Hash Key

No sort key in 'Bag' table

Batch Write Code:-

var AWS = require("aws-sdk");

AWS.config.update({
    region : "us-west-2",
    endpoint : "http://localhost:8000"
});

var documentclient = new AWS.DynamoDB.DocumentClient();

var itemsArray = [];

var item1 = {
    DeleteRequest : {
        Key : {
            'bag' : 'b1'    
        }
    }
};

itemsArray.push(item1);

var item2 = {
    DeleteRequest : {
        Key : {
            'bag' : 'b2'    
        }
    }
};

itemsArray.push(item2);

var params = {
    RequestItems : {
        'Bag' : itemsArray
    }
};
documentclient.batchWrite(params, function(err, data) {
    if (err) {
        console.log('Batch delete unsuccessful ...');
        console.log(err, err.stack); // an error occurred
    } else {
        console.log('Batch delete successful ...');
        console.log(data); // successful response
    }

});

Output:-

Batch delete successful ...
{ UnprocessedItems: {} }
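
For a table like the one in the question, which has both a partition key (uuid) and a sort key (version), each Key in a DeleteRequest has to carry both attributes, otherwise DynamoDB rejects it with the schema error. A minimal sketch of building such requests follows; buildDeleteRequests, chunk, the table name "MyTable" and the sample values are hypothetical names used only for illustration:

```javascript
// Hypothetical helper: one DeleteRequest per (uuid, version) pair.
// Assumes the table's primary key is uuid (partition) + version (sort).
function buildDeleteRequests(uuid, versions) {
  return versions.map((version) => ({
    DeleteRequest: { Key: { uuid, version } },
  }));
}

// Hypothetical helper: split an array into groups of at most `size`.
// BatchWriteItem accepts at most 25 put/delete requests per call.
function chunk(items, size = 25) {
  const groups = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Example: params for deleting three versions of one uuid.
const requests = buildDeleteRequests("abc-123", [1, 2, 3]);
const params = { RequestItems: { MyTable: requests } };
console.log(JSON.stringify(params, null, 2));
```

Each group returned by chunk can then be passed as the array for the table name in RequestItems, the same way itemsArray is passed above.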
notionquest answered Oct 22 '22 06:10


This is doable with a Node.js Lambda, but there are a few things you need to consider to address concurrency while processing large tables:

  • Handle paging while querying all of the matching elements from a secondary index
  • Split into chunks of 25 requests as per the BatchWriteItem limits https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html
  • Above 40,000 matches you might need a 1 second delay between cycles https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Limits.html

Here is a snippet that I wrote:

const AWS = require("aws-sdk");
const dynamodb = new AWS.DynamoDB.DocumentClient();
const log = console.log;

exports.handler = async (event) => {

  log(event);
  let TableName = event.tableName;
  let params = {
    TableName,
    FilterExpression: "userId = :uid",
    ExpressionAttributeValues: {
      ":uid": event.userId,
    },
  };
  // Scan page by page, collecting the id of every matching item
  let getItems = async (lastKey, items) => {
    if (lastKey) params.ExclusiveStartKey = lastKey;
    let resp = await dynamodb.scan(params).promise();
    // Reassign the parameter; redeclaring it with `let` is a syntax error
    items = resp.Items.length
      ? items.concat(resp.Items.map((x) => x.id))
      : items;
    if (resp.LastEvaluatedKey)
      return await getItems(resp.LastEvaluatedKey, items);
    else return items;
  };
  let ids = await getItems(null, []);
  let idGroups = [];

  for (let i = 0; i < ids.length; i += 25) {
    idGroups.push(ids.slice(i, i + 25));
  }

  for (const gs of idGroups) {
    let delReqs = [];
    for (let id of gs) {
      delReqs.push({ DeleteRequest: { Key: { id } } });
    }
    let RequestItems = {};
    RequestItems[TableName] = delReqs;
    let d = await dynamodb
      .batchWrite({ RequestItems })
      .promise().catch((e) => log(e));
  }
  log(ids.length + " items processed");
  return {};
};
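
One thing the snippet above does not cover: a batchWrite call can succeed while still returning some requests in UnprocessedItems, for example when the table is throttled. A rough sketch of resubmitting those with exponential backoff is shown here. batchWriteWithRetry, maxAttempts and baseDelayMs are hypothetical names, and client is assumed to be anything with a DocumentClient-style batchWrite(params).promise() method:

```javascript
// Hypothetical helper: resubmits UnprocessedItems until none remain or
// maxAttempts calls have been made. Returns the number of calls made.
async function batchWriteWithRetry(client, requestItems, maxAttempts = 5, baseDelayMs = 100) {
  let pending = requestItems;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const resp = await client.batchWrite({ RequestItems: pending }).promise();
    const unprocessed = resp.UnprocessedItems || {};
    if (Object.keys(unprocessed).length === 0) {
      return attempt; // everything was accepted
    }
    pending = unprocessed;
    if (attempt < maxAttempts) {
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw new Error(`Unprocessed items remain after ${maxAttempts} attempts`);
}
```

In the loop above, the direct dynamodb.batchWrite(...) call could be swapped for batchWriteWithRetry(dynamodb, RequestItems); the delay values are placeholders to tune, not recommendations.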

Mike Bendorf answered Oct 22 '22 08:10


Not sure why nobody provided a proper answer.

Here's a Lambda I wrote in Node.js. It performs a full scan of the table, then batch-deletes the items in groups of 25 per request.

Remember to change TABLE_NAME.

const AWS = require('aws-sdk');

const docClient = new AWS.DynamoDB.DocumentClient({ apiVersion: '2012-08-10' });

//const { TABLE_NAME } = process.env;
const TABLE_NAME = "CHANGE ME PLEASE";

exports.handler = async (event) => {
    let params = {
        TableName: TABLE_NAME,
    };

    let items = [];
    let data = await docClient.scan(params).promise();
    items = [...items, ...data.Items];

    while (typeof data.LastEvaluatedKey != 'undefined') {
        params.ExclusiveStartKey = data.LastEvaluatedKey;

        data = await docClient.scan(params).promise();
        items = [...items, ...data.Items];
    }

    let leftItems = items.length;
    let group = [];
    let groupNumber = 0;

    console.log('Total items to be deleted', leftItems);

    for (const i of items) {
        const deleteReq = {
            DeleteRequest: {
                Key: {
                    id: i.id,
                },
            },
        };

        group.push(deleteReq);
        leftItems--;

        if (group.length === 25 || leftItems < 1) {
            groupNumber++;

            console.log(`Batch ${groupNumber} to be deleted.`);

            const params = {
                RequestItems: {
                    [TABLE_NAME]: group,
                },
            };

            await docClient.batchWrite(params).promise();

            console.log(
                `Batch ${groupNumber} processed. Left items: ${leftItems}`
            );

            // reset
            group = [];
        }
    }

    const response = {
        statusCode: 200,
        //  Uncomment below to enable CORS requests
        //  headers: {
        //      "Access-Control-Allow-Origin": "*"
        //  },
        body: JSON.stringify('Hello from Lambda!'),
    };
    return response;
};
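
If the goal is the one from the question, deleting every item that shares a single uuid, a Query on the partition key is likely cheaper than scanning the whole table. The sketch below only collects the keys to delete. collectKeysForUuid is a hypothetical name, the uuid/version attribute names come from the question, and client is assumed to be anything with a DocumentClient-style query(params).promise() method:

```javascript
// Hypothetical helper: pages through a Query on the partition key and
// returns the full primary key ({ uuid, version }) of every match.
async function collectKeysForUuid(client, tableName, uuid) {
  const params = {
    TableName: tableName,
    KeyConditionExpression: "#u = :u",
    // Placeholders avoid any clash with DynamoDB reserved words.
    ExpressionAttributeNames: { "#u": "uuid", "#v": "version" },
    ExpressionAttributeValues: { ":u": uuid },
    // Only the key attributes are needed to build DeleteRequests.
    ProjectionExpression: "#u, #v",
  };

  const keys = [];
  let data;
  do {
    data = await client.query(params).promise();
    for (const item of data.Items) {
      keys.push({ uuid: item.uuid, version: item.version });
    }
    // Continue from where the previous page stopped, if there is more.
    params.ExclusiveStartKey = data.LastEvaluatedKey;
  } while (data.LastEvaluatedKey);

  return keys;
}
```

Each returned key can be wrapped as { DeleteRequest: { Key: key } } and sent in groups of 25, the same way the handler above does with id.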

NubPro answered Oct 22 '22 08:10