If I create a named key for use in Google AppEngine, what kind of String is the key-name? Does it use Unicode characters or is it a binary string?
More specifically, if I want to have my key-name made up of 8-bit binary data, is there a way I can do it? If not, can I at least use 7-bit binary data? Or are there any reserved values? Does it use NULL as the End-Of-String marker, for example?
GAE docs do not specify any restrictions on the key-name String. So a String with any content should be valid.
If you want to use a binary data as an identifier, then you should encode it into a String. You can use any of the binary-to-text encoding methods: most used seem to be Base64 (3 bytes = 4 chars) and BinHex (1 byte = 2 chars).
I meanwhile had some time to actually test this out by generating a bunch of keys with binary names and then performing a kind-only query to get all the keys back. Here are the results:
"\x00\x13\x127\x255"
, a query will find this entity and its key name will return that same string"\x00test"
and \x00\x00test
will both show up as separate entities, but their keys are both shown as "test"
From this, I would derive the following recommendations:
Asides:
I will accept @PeterKnego's answer instead of this one since this one basically only confirms and expands on what he already assumed correctly.
From looking through the source code of the Java API, I think that the UTF-8 encoding of the key-name happens in the API (while building the protocol buffer) rather than in BigTable, so if you really want to go nuts on storage space maximization, it may be possible to build your own protocol buffers and store full 8-bit data without overhead. But this is probably asking for trouble...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With