
Enforcing Unique Constraint in GAE

I am trying out Google App Engine for Java, but the absence of a unique constraint is making things difficult. I have been through this post, and this blog suggests a method to implement something similar. My background is in MySQL. Moving to the datastore without a unique constraint makes me jittery, because I never had to worry about duplicate values before, and checking each value before inserting a new one still leaves room for error.

"No, you still cannot specify unique during schema creation."

-- David Underhill talks about GAE and the unique constraint (post link)

What are you guys using to implement something similar to a unique or primary key?

I heard about an abstract datastore layer, built on the low-level API, that worked like a regular RDB. It was not free, however (and I do not remember the name of the software).

Schematic view of my problem

sNo = biggest serial_number in the db
sNo++
Insert new entry with sNo as serial_number value //checkpoint
User adds data pertaining to current serial_number 
Update entry with data where serial_number is sNo 

However, at line 3 (the checkpoint), I feel two users might add the same sNo. That is what is preventing me from working with App Engine.

abel asked Oct 04 '10 13:10



1 Answer

This and other similar questions come up often when talking about transitioning from a traditional RDB to a BigTable-like datastore such as App Engine's.

It's often useful to discuss why the datastore doesn't support unique constraints, since it informs the mindset you should be in when thinking about your data storage schemes. Unique constraints are not available because enforcing them greatly limits scalability. As you've said, enforcing the constraint means checking all other entities for that property. Whether you do it manually in your code or the datastore does it automatically behind the scenes, the check still has to happen on every write, and that means lower performance. Some optimizations can be made, but the work cannot be avoided entirely.
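To make the "room for error" in a manual check concrete, here is a minimal sketch in plain Python. It does not use the App Engine API; the list of dicts and the function name insert_if_absent_racy are hypothetical stand-ins, and the interleaving of the two requests is forced by hand rather than produced by real concurrency.

```python
def insert_if_absent_racy(entities, serial, owners):
    """Simulate requests that all check for `serial` before any of them inserts."""
    # Step 1: every request runs its "does it already exist?" check first.
    checks = [
        not any(e["serial_number"] == serial for e in entities)
        for _ in owners
    ]
    # Step 2: each request that saw no match goes ahead and inserts.
    for owner, saw_free in zip(owners, checks):
        if saw_free:
            entities.append({"serial_number": serial, "owner": owner})
    return entities


rows = insert_if_absent_racy([], serial=1, owners=["user_a", "user_b"])
duplicates = [e for e in rows if e["serial_number"] == 1]
```

Both requests end up inserting serial 1, because neither check could see the other's pending write. Closing that gap means serializing the check with the write, and that serialization is the scalability cost described above.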

So the first answer to your question is: really think about why you need that unique constraint.

Secondly, remember that keys do exist in the datastore, and are a great way of enforcing a simple unique constraint.

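from google.appengine.api import users
# The email is used as the entity's key name, which is unique per kind.
my_user = MyUser(key_name=users.get_current_user().email())
my_user.put()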

This guarantees that at most one MyUser entity ever exists for that email (a later put() with the same key_name overwrites the existing entity rather than creating a duplicate), and you can also quickly retrieve the MyUser with that email:

my_user = MyUser.get_by_key_name(users.get_current_user().email())

In the Python runtime you can also do:

my_user = MyUser.get_or_insert(users.get_current_user().email())

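This retrieves the MyUser with that email, or inserts it if it does not exist yet. The lookup and the possible insert run inside a transaction, so two concurrent requests cannot both create it.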
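If you want to see the insert-or-retrieve behaviour without the App Engine SDK, the following is a rough in-memory sketch. KeyedStore and claim_concurrently are hypothetical names, not part of any App Engine API, and the lock only stands in for the datastore transaction; it is an illustration of the semantics, not of how the datastore is implemented.

```python
import threading


class KeyedStore:
    """Hypothetical in-memory stand-in for one datastore kind, keyed by name."""

    def __init__(self):
        self._entities = {}
        self._lock = threading.Lock()  # plays the role of a transaction

    def get_or_insert(self, key_name, **props):
        """Return (entity, created); only the first caller for a key creates it."""
        with self._lock:
            if key_name in self._entities:
                return self._entities[key_name], False
            entity = dict(props, key_name=key_name)
            self._entities[key_name] = entity
            return entity, True


def claim_concurrently(store, key_name, workers=8):
    """Have several threads claim the same key; return how many created it."""
    results = []

    def worker(i):
        _, created = store.get_or_insert(key_name, owner=i)
        results.append(created)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)


store = KeyedStore()
winners = claim_concurrently(store, "abel@example.com")
```

However many threads race for the same key name, exactly one of them creates the entity and the rest get the existing one back. That is the uniqueness guarantee you get from keys.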

Anything more complex than that will not be scalable, though. So really think about whether you need that property to be globally unique, or whether there are ways to remove the need for the constraint. Often you'll find that, with some small workarounds, you didn't need that property to be unique after all.
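For the serial number case in the question specifically, one possible shape is a single counter whose read-increment-write happens atomically. The sketch below is plain Python with hypothetical names (SerialAllocator, allocate_from_threads); the lock stands in for a datastore transaction on one counter entity, and this is an assumption about how you might structure it, not an App Engine API.

```python
import threading


class SerialAllocator:
    """Hypothetical counter; the lock stands in for a datastore transaction."""

    def __init__(self, start=0):
        self._last = start
        self._lock = threading.Lock()

    def next_serial(self):
        # Read, increment and write happen as one atomic step, so no two
        # callers can observe the same "biggest serial_number".
        with self._lock:
            self._last += 1
            return self._last


def allocate_from_threads(allocator, workers=4, per_worker=50):
    """Allocate serials from several threads and return all of them."""
    issued = []

    def worker():
        for _ in range(per_worker):
            issued.append(allocator.next_serial())

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return issued


serials = allocate_from_threads(SerialAllocator())
```

Every serial is handed out exactly once, but all writers queue on the same counter, which is the kind of bottleneck the scalability warning above is about. If the numbers only need to be unique and not consecutive, using them as keys (or letting the datastore assign numeric IDs) avoids the shared counter.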

Jason Hall answered Sep 20 '22 20:09
