Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

bson.errors.InvalidDocument: key '$numberDecimal' must not start with '$' when using json

I have a small json file, with the following lines:

{
    "IdTitulo": "Jaws",
    "IdDirector": "Steven Spielberg",
    "IdNumber": 8,
    "IdDecimal": "2.33"
}

An there is a schema in my db collection, named test_dec. This is what I've used to create the schema:

db.createCollection("test_dec",
{validator: {
    $jsonSchema: {
         bsonType: "object",
         required: ["IdTitulo","IdDirector"],
         properties: {
         IdTitulo: {
                "bsonType": "string",
                "description": "string type, nombre de la pelicula"
            },
         IdDirector: {
                "bsonType": "string",
                "description": "string type, nombre del director"
            },
        IdNumber : {
                "bsonType": "int",
                "description": "number type to test"
            },
        IdDecimal : {
                 "bsonType": "decimal",
                 "description": "decimal type"
                    }
       }
    }}
    })

I've made multiple attempts to insert the data. The problem is in the IdDecimal field value.

Some of the trials, replacing the IdDecimal line by:

 "IdDecimal": 2.33

 "IdDecimal": {"$numberDecimal": "2.33"}

 "IdDecimal": NumberDecimal("2.33")

None of them work. The second one is the formal solution provided by MongoDB manuals (mongodb-extended-json) adn the error is the output I've placed in my question: bson.errors.InvalidDocument: key'$numberDecimal' must not start with '$'.

I am currently using a python to load the json. I've been playing around with this file:

import os,sys
import re
import io
import json
from pymongo import MongoClient
from bson.raw_bson import RawBSONDocument
from bson.json_util import CANONICAL_JSON_OPTIONS,dumps,loads
import bsonjs as bs

#connection
client = MongoClient('localhost',27018,document_class=RawBSONDocument)
db     = client['myDB']
coll   = db['test_dec']   
other_col = db['free']                                                                                        

for fname in os.listdir('/mnt/win/load'):                                                                               
    num = re.findall("\d+", fname)

    if num:

       with io.open(fname, encoding="ISO-8859-1") as f:

            doc_data = loads(dumps(f,json_options=CANONICAL_JSON_OPTIONS))

            print(doc_data) 

            test = '{"idTitulo":"La pelicula","idRelease":2019}'
            raw_bson = bs.loads(test)
            load_raw = RawBSONDocument(raw_bson)

            db.other_col.insert_one(load_raw)


client.close()

I am using a json file. If I try to parse anything like Decimal128('2.33') the output is "ValueError: No JSON object could be decoded", because my json has an invalid format.

The result of

    db.other_col.insert_one(load_raw) 

Is that the content of "test" is inserted. But I cannot use doc_data with RawBSONDocument, because it goes like that. It says:

  TypeError: unpack_from() argument 1 must be string or buffer, not list:

When I manage to parse the json directly to the RawBSONDocument I got all the trash within and the record in database looks like the sample here:

   {
    "_id" : ObjectId("5eb2920a34eea737626667c2"),
    "0" : "{\n",
    "1" : "\t\"IdTitulo\": \"Gremlins\",\n",
    "2" : "\t\"IdDirector\": \"Joe Dante\",\n",
    "3" : "\t\"IdNumber\": 6,\n",
    "4" : "\"IdDate\": {\"$date\": \"2010-06-18T:00.12:00Z\"}\t\n",
    "5" : "}\n"
     }

It seems it is not that simple to load a extended json into MongoDB. The extended version is because I want to use schema validation.

Oleg pointed out that is numberDecimal and not NumberDecimal as I had it before. I've fixed the json file, but nothing changed.

Executed:

with io.open(fname, encoding="ISO-8859-1") as f:
      doc_data = json.load(f)                
      coll.insert(doc_data)

And the json file:

 {
    "IdTitulo": "Gremlins",
    "IdDirector": "Joe Dante",
    "IdNumber": 6,
    "IdDecimal": {"$numberDecimal": "3.45"}
 }
like image 981
powerPixie Avatar asked Apr 29 '20 10:04

powerPixie


1 Answers

One more roll of the dice from me. If you are using schema validation as you are, I would recommend defining a class and being explicit with defining each field and how you propose to convert the field to the relevant python datatypes. While your solution is generic, the data structure has to be rigid to match the validation.

IMO this is clearer and you have control over any errors etc within the class.

Just to confirm I ran the schema validation and this works with the supplied validation.

from pymongo import MongoClient
import bson.json_util
import dateutil.parser
import json

class Film:
    def __init__(self, file):
        data = file.read()
        loaded = json.loads(data)
        self.IdTitulo  = loaded.get('IdTitulo')
        self.IdDirector = loaded.get('IdDirector')
        self.IdDecimal = bson.json_util.Decimal128(loaded.get('IdDecimal'))
        self.IdNumber = int(loaded.get('IdNumber'))
        self.IdDateTime = dateutil.parser.parse(loaded.get('IdDateTime'))

    def insert_one(self, collection):
        collection.insert_one(self.__dict__)

client = MongoClient()
mycollection = client.mydatabase.test_dec

with open('c:/temp/1.json', 'r') as jfile:
    film = Film(jfile)
    film.insert_one(mycollection)

gives:

> db.test_dec.findOne()
{
        "_id" : ObjectId("5eba79eabf951a15d32843ae"),
        "IdTitulo" : "Jaws",
        "IdDirector" : "Steven Spielberg",
        "IdDecimal" : NumberDecimal("2.33"),
        "IdNumber" : 8,
        "IdDateTime" : ISODate("2020-05-12T10:08:21Z")
}

>

JSON file used:

{
    "IdTitulo": "Jaws",
    "IdDirector": "Steven Spielberg",
    "IdNumber": 8,
    "IdDecimal": "2.33",
    "IdDateTime": "2020-05-12T11:08:21+0100"
}
like image 169
Belly Buster Avatar answered Oct 15 '22 03:10

Belly Buster