Mongo UUID python vs java format

I have an application that sends requests to a REST API, where a Java process stores the data in MongoDB. When I read this data back directly from the database using pymongo, the UUIDs come out different (it seems to be due to different encodings in Java and Python).

Is there a way to convert this UUID back and forth?

EDIT:

A few examples:

in java: 38f51c1d-360e-42c1-8f9a-3f0a9d08173d, 1597d6ea-8e5f-473b-a034-f51de09447ec

in python: c1420e36-1d1c-f538-3d17-089d0a3f9a8f, 3b475f8e-ead6-9715-ec47-94e01df534a0

thanks,

jamborta asked May 20 '26 09:05


1 Answer

I spent a day of my life trying to tackle this same issue...

The root problem is likely that your Java code is storing the UUIDs with the Java driver's legacy encoding, BSON binary subtype 3 (not to be confused with version-3 UUIDs). To verify, log in with the Mongo shell and look at the raw output of your UUIDs. If the first argument to BinData is 3, then that's the issue.

db.my_collection_name.find().limit(1)
...BinData(3,"blahblahblahblahblah"),...

With subtype 3, each Mongo driver picked its own byte order for UUIDs, so the Java, C#, and Python drivers all store the same UUID differently. (thanks Mongo…) It wasn't until binary subtype 4 that Mongo standardized the encoding across all their drivers. Ideally you should switch to subtype 4, but that means migrating the data you already have, so it's not necessarily practical. REFERENCE: http://3t.io/blog/best-practices-uuid-mongodb/
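To see what the legacy Java encoding actually does to the bytes, here is a minimal sketch in plain Python (no Mongo needed). It assumes the Java driver writes each 8-byte half of the UUID in reversed order, which is consistent with the two pairs in the question. The helper name swap_java_legacy is mine, not part of pymongo.

```python
import uuid


def swap_java_legacy(u: uuid.UUID) -> uuid.UUID:
    """Reverse each 8-byte half of the UUID (hypothetical helper).

    Applying it twice returns the original value, so the same function
    converts in both directions.
    """
    raw = u.bytes
    return uuid.UUID(bytes=raw[:8][::-1] + raw[8:][::-1])


# First example pair from the question.
java_side = uuid.UUID("38f51c1d-360e-42c1-8f9a-3f0a9d08173d")
python_side = swap_java_legacy(java_side)

print(python_side)                                 # c1420e36-1d1c-f538-3d17-089d0a3f9a8f
print(swap_java_legacy(python_side) == java_side)  # True
```

This is handy for spot-checking a value by hand, but for real queries it is probably easier to let the driver do the conversion.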

Not to worry, there's hope! The magic technique to make it all work involves simply pulling the collection with the JAVA_LEGACY UUID representation in the CodecOptions.

my_collection = db.get_collection('MyCollectionName', CodecOptions(uuid_representation=JAVA_LEGACY))

After that you can query with the UUIDs from your APIs and your query results will also have the UUIDs in the same format as your APIs.

Here is a complete query example using this technique.

import pprint
import uuid
from bson.binary import JAVA_LEGACY
from bson.codec_options import CodecOptions
from pymongo import MongoClient

PP = pprint.PrettyPrinter(indent=2)

client = MongoClient('localhost', 27017)
db = client.my_database

# Encode and decode subtype-3 UUIDs using the Java driver's legacy byte order.
# REFERENCES: http://3t.io/blog/best-practices-uuid-mongodb/  |  http://api.mongodb.org/python/current/api/bson/binary.html
my_collection = db.get_collection('my_collection', codec_options=CodecOptions(uuid_representation=JAVA_LEGACY))

# The UUID exactly as the Java side / REST API reports it.
my_java_uuid = "bee4ecb8-11e8-4267-8885-1bf7657fe6b7"
results = list(my_collection.find({"my_uuid": uuid.UUID(my_java_uuid)}))

# An empty result list simply skips the loop, so no separate emptiness check is needed.
for result in results:
    PP.pprint(result)
Rob.Kachmar answered May 22 '26 21:05