How to write JSON data to DynamoDB while ignoring empty elements in boto3

I would like to write the following set of records to DynamoDB.
There are about 100 records. Images are optional, so some records have a value for image_url and others leave it empty.

(questionList.json)

{
  "q_id" : "001",
  "q_body" : "Where is the capital of the United States?",
  "q_answer" : "Washington, D.C.",
  "image_url" : "/Washington.jpg",
  "keywords" : [
    "UnitedStates",
    "Washington"
  ]
},
{
  "q_id" : "002",
  "q_body" : "Where is the capital city of the UK?",
  "q_answer" : "London",
  "image_url" : "",
  "keywords" : [
    "UK",
    "London"
  ]
},

Since this is the write-test phase, the target DynamoDB is not the production environment. It runs on localhost:8000 via the serverless-dynamodb-local plugin of the Serverless Framework.
To write the above JSON data to this DynamoDB, I wrote the following code with Boto 3 (the AWS SDK for Python).

from __future__ import print_function
import boto3
import codecs
import json

dynamodb = boto3.resource('dynamodb', region_name='us-east-1', endpoint_url="http://localhost:8000")

table = dynamodb.Table('questionListTable')

with open("questionList.json", "r", encoding='utf-8') as json_file:
    items = json.load(json_file)
    for item in items:
        q_id = item['q_id']
        q_body = item['q_body']
        q_answer = item['q_answer']
        image_url = item['image_url']
        keywords = item['keywords']

        print("Adding detail:", q_id, q_body)

        table.put_item(
            Item={
                'q_id': q_id,
                'q_body': q_body,
                'q_answer': q_answer,
                'image_url': image_url,
                'keywords': keywords,
            }
        )

When this code is executed, the following error occurs for the record whose image_url is an empty string.

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the PutItem operation: One or more parameter values were invalid: An AttributeValue may not contain an empty string

The error appears to be caused by the empty string in the JSON.
If I exclude image_url, which contains the empty string, from the write as below, the write completes without any problem.

from __future__ import print_function
import boto3
import codecs
import json

dynamodb = boto3.resource('dynamodb', region_name='us-east-1', endpoint_url="http://localhost:8000")

table = dynamodb.Table('questionListTable')

with open("questionList.json", "r", encoding='utf-8') as json_file:
    items = json.load(json_file)
    for item in items:
        q_id = item['q_id']
        q_body = item['q_body']
        q_answer = item['q_answer']
        #image_url = item['image_url']
        keywords = item['keywords']

        print("Adding detail:", q_id, q_body)

        table.put_item(
            Item={
                'q_id': q_id,
                'q_body': q_body,
                'q_answer': q_answer,
                #'image_url': image_url,
                'keywords': keywords,
            }
        )
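
Before writing, it can help to see which records would trigger the ValidationException. The following is a minimal sketch, not part of the original code: find_empty_strings is a hypothetical helper name, and it only inspects top-level attributes, not values nested inside lists or maps.

```python
def find_empty_strings(item):
    """Return the names of top-level attributes whose value is an empty string."""
    return [name for name, value in item.items() if value == ""]


# Two records shaped like the ones in questionList.json
items = [
    {
        "q_id": "001",
        "q_body": "Where is the capital of the United States?",
        "q_answer": "Washington, D.C.",
        "image_url": "/Washington.jpg",
        "keywords": ["UnitedStates", "Washington"],
    },
    {
        "q_id": "002",
        "q_body": "Where is the capital city of the UK?",
        "q_answer": "London",
        "image_url": "",
        "keywords": ["UK", "London"],
    },
]

for item in items:
    empty = find_empty_strings(item)
    if empty:
        # Only records with at least one empty string are reported
        print(item["q_id"], "has empty attributes:", empty)
```

With the two sample records, only q_id 002 is reported, for image_url.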

Since DynamoDB is NoSQL, there may be other approaches that make better use of its characteristics, but how can I change the code so that it writes the above data while ignoring empty strings? In other words: if image_url has a value, write it; if it does not, ignore it.

Thank you.

asked Aug 18 '17 01:08 by uhiyama


1 Answer

I solved my problem. An empty value can be replaced with None, which is stored as null, as follows.

from __future__ import print_function
import boto3
import codecs
import json

dynamodb = boto3.resource('dynamodb', region_name='ap-northeast-1', endpoint_url="http://localhost:8000")

table = dynamodb.Table('questionListTable')

with open("questionList.json", "r", encoding='utf-8_sig') as json_file:
    items = json.load(json_file)
    for item in items:
        q_id = item['q_id']
        q_body = item['q_body']
        q_answer = item['q_answer']
        # Replace empty values with None, which boto3 stores as a DynamoDB NULL
        image_url = item['image_url'] if item['image_url'] else None
        keywords = item['keywords'] if item['keywords'] else None

        print("Adding detail:", q_id, q_body)

        # Keep put_item inside the loop so that every item is written
        table.put_item(
            Item={
                'q_id': q_id,
                'q_body': q_body,
                'q_answer': q_answer,
                'image_url': image_url,
                'keywords': keywords,
            }
        )

To check the state of DynamoDB, I used the offline plugin of the Serverless Framework to run API Gateway in the local environment. When I called the API using Postman, null had been stored as the value.

{
  "q_id" : "001",
  "q_body" : "Where is the capital of the United States?",
  "q_answer" : "Washington, D.C.",
  "image_url" : "/Washington.jpg",
  "keywords" : [
    "UnitedStates",
    "Washington"
  ]
},
{
  "q_id" : "002",
  "q_body" : "Where is the capital city of the UK?",
  "q_answer" : "London",
  "image_url" : "null",
  "keywords" : [
    "UK",
    "London"
  ]
},
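
The result above keeps image_url and stores a null for it. If the attribute should be left out of the item entirely, which is closer to "if it does not exist, ignore it", one option is to filter each item before calling put_item. This is only a sketch under assumptions: is_empty, drop_empty and put_items are hypothetical helper names, and table is assumed to be the boto3 Table resource used above.

```python
def is_empty(value):
    """True for None and for empty strings, lists, and dicts."""
    return value is None or (
        isinstance(value, (str, list, dict)) and len(value) == 0
    )


def drop_empty(item):
    """Return a copy of item without its empty top-level attributes."""
    return {name: value for name, value in item.items() if not is_empty(value)}


def put_items(table, items):
    """Write each item with put_item, omitting empty attributes.

    `table` is assumed to be a boto3 DynamoDB Table resource, for example
    boto3.resource('dynamodb', ...).Table('questionListTable').
    """
    for item in items:
        table.put_item(Item=drop_empty(item))


question = {
    "q_id": "002",
    "q_body": "Where is the capital city of the UK?",
    "q_answer": "London",
    "image_url": "",
    "keywords": ["UK", "London"],
}

# image_url is absent from the filtered copy; the other attributes are unchanged
print(drop_empty(question))
```

Values such as 0 and False are kept, because only None and empty strings, lists, and dicts count as empty here. Key attributes such as q_id still need a value, since an item whose key was dropped cannot be written.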
answered Sep 29 '22 06:09 by uhiyama