Acceptable encoding for Cosmos DB IDs to replace illegal characters?

I'm trying to store data in Cosmos DB where the IDs contain a slash (/), but slash is an illegal character in Cosmos IDs. I first tried URL-encoding the slashes (%2F), since that's the form I generally receive them in through API requests. However, although percent (%) is not an illegal character for IDs, Cosmos still chokes on it: it cannot retrieve many documents with a percent in the ID. It works for some, but it appears to fail when the % is followed by certain characters.

Is there an encoding suitable for Cosmos DB IDs that replaces illegal characters in the original ID text without introducing illegal or unhandled characters (like %) in the encoded ID text? I'd prefer to stay away from things like Base64, which make the IDs hard for people to decipher. I'd also like to avoid simple character replacement (/ becomes -), in case an ID already uses the replacement character.

Rob Mosher asked Sep 15 '25 04:09


1 Answer

I ended up doing simple character replacement, swapping out slashes (/) with pipes (|).

What makes this livable is adding a value converter with Entity Framework.

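// Value converter (needs using System.Linq.Expressions;): "/" is stored as "|" and mapped back on read.
Expression<Func<string?, string>> toDB = v => v!.Replace("/", "|");
Expression<Func<string, string?>> fromDB = v => v!.Replace("|", "/");
// builder is presumably the entity's type builder from the model configuration.
builder.Property(p => p.Id).HasConversion(toDB, fromDB);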

This makes the character replacement happen automatically when reading from and writing to the database. You only need to worry about the difference when accessing the database directly, from other code that lacks the converter, or possibly when doing custom searches. I do the translation manually for a filtering framework we use, and I suspect other ID search solutions would need the same manual translation.

Ultimately I decided this was acceptable as we are unlikely to have other characters that need translation for our case, the translation is easy to do visually, and it's transparent in most cases with ValueConverters. But it isn't a general solution that would work for any possible string id.

Edit: On second thought, this solution is deficient. Cosmos does actually allow creating documents with illegal characters in the ID; it just doesn't allow accessing or deleting them easily. An ideal solution would prevent all illegal characters across the board, whether expected or not.
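
One way to cover all illegal characters while keeping IDs readable might be a small reversible escape scheme instead of a one-to-one swap. The sketch below is only an illustration under stated assumptions: CosmosIdEscaper is a hypothetical helper (not part of Cosmos DB or Entity Framework), the restricted set is assumed to be / \ ? #, percent is added because of the lookup failures described in the question, and the pipe is assumed safe as an escape character because the replacement above already stores it without trouble. Escaping the pipe itself is what lets an original ID containing a pipe round-trip correctly.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration; not a Cosmos DB or EF API.
public static class CosmosIdEscaper
{
    // Assumption: '|' is accepted in stored IDs, as in the replacement above.
    private const char Escape = '|';

    // Assumption: / \ ? # are the restricted ID characters. '%' is included
    // because the question reports failed lookups with it, and '|' is included
    // so that decoding stays unambiguous.
    private const string Reserved = "/\\?#%|";

    // Replaces each reserved character with '|' plus its two-digit hex code.
    public static string Encode(string id)
    {
        var sb = new StringBuilder(id.Length);
        foreach (char c in id)
        {
            if (Reserved.IndexOf(c) >= 0)
                sb.Append(Escape).Append(((int)c).ToString("X2", CultureInfo.InvariantCulture));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Reverses Encode; throws FormatException on a malformed escape sequence.
    public static string Decode(string stored)
    {
        var sb = new StringBuilder(stored.Length);
        for (int i = 0; i < stored.Length; i++)
        {
            if (stored[i] != Escape)
            {
                sb.Append(stored[i]);
                continue;
            }
            if (i + 2 >= stored.Length)
                throw new FormatException("Truncated escape sequence in stored ID.");
            sb.Append((char)Convert.ToInt32(stored.Substring(i + 1, 2), 16));
            i += 2; // skip the two hex digits just consumed
        }
        return sb.ToString();
    }
}
```

With this scheme an ID such as orders/2024 would be stored as orders|2F2024, which is still easy to read by eye. The two methods could presumably be passed to the same HasConversion call in place of the Replace expressions, and code that bypasses the converter would call them directly.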

Rob Mosher answered Sep 17 '25 20:09