I am using .NET, and I need to truncate a string that may contain multibyte characters so that it will not be over a set length once it is URL encoded. This seems like something that would be built in, but I can't find it.
I could just take a substring of the URL-encoded string, but that might cut off part of an encoded character (a space becomes %20, and at the end of the string it could be truncated to %2, which is invalid) or part of a multibyte character (π is encoded as %CF%80, and it could be truncated to %, %CF, or %CF%8, all of which are wrong).
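To make the failure concrete, here is a minimal sketch of the encode-then-cut approach. It assumes `Uri.EscapeDataString` as the encoder, since that matches the %20 and %CF%80 examples; the class and method names are made up for illustration.

```csharp
using System;

public static class NaiveTruncationDemo
{
    // Encodes first and then cuts the encoded text, which is the approach
    // that can split a %XX escape or a multibyte sequence.
    public static string EncodeThenCut(string value, int maxLength)
    {
        string encoded = Uri.EscapeDataString(value);
        return encoded.Length <= maxLength
            ? encoded
            : encoded.Substring(0, maxLength);
    }
}
```

With this, `EncodeThenCut("a b", 3)` gives `a%2` and `EncodeThenCut("π", 5)` gives `%CF%8`, neither of which decodes back to valid text.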
My quick Google search didn't turn up anything for this, which is slightly surprising since this seems like a relatively common problem (at least for those who don't avoid monstrously long URLs).
You could do this iteratively: encode the string, and if the encoded string is too long, chop a character off the original and re-encode, repeating until the encoded string is short enough. This re-encodes the whole string on every pass, so it would be slow for long inputs.
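The iterative idea might look roughly like the first method below. The second method is a different technique, not something from the question: a single pass that adds up the encoded length of each code point and stops before the limit is exceeded. Both are sketches under assumptions: the encoder is `Uri.EscapeDataString`, the input is well-formed UTF-16 (no lone surrogates), and the class and method names are hypothetical. Both return the truncated original string, which you would then encode.

```csharp
using System;

public static class UrlTruncation
{
    // Iterative approach: drop the last code point and re-encode until it fits.
    public static string TruncateByReencoding(string value, int maxEncodedLength)
    {
        string candidate = value;
        while (candidate.Length > 0 &&
               Uri.EscapeDataString(candidate).Length > maxEncodedLength)
        {
            // Remove both halves of a trailing surrogate pair together.
            bool pairAtEnd = candidate.Length >= 2 &&
                             char.IsSurrogatePair(candidate, candidate.Length - 2);
            candidate = candidate.Substring(0, candidate.Length - (pairAtEnd ? 2 : 1));
        }
        return candidate;
    }

    // Single-pass alternative: each code point is escaped independently,
    // so the per-code-point encoded lengths can simply be summed.
    public static string TruncateSinglePass(string value, int maxEncodedLength)
    {
        int encodedLength = 0;
        int index = 0;
        while (index < value.Length)
        {
            // A surrogate pair is one code point spanning two chars.
            int width = char.IsSurrogatePair(value, index) ? 2 : 1;
            int cost = Uri.EscapeDataString(value.Substring(index, width)).Length;
            if (encodedLength + cost > maxEncodedLength)
            {
                break;
            }
            encodedLength += cost;
            index += width;
        }
        return value.Substring(0, index);
    }
}
```

For example, `TruncateSinglePass("a b", 3)` returns `a`, because adding the space would cost three more characters (%20) and push the encoded length to 4. Working on the unencoded string and cutting only at code point boundaries means the result can never end in a partial escape.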