I believe Windows currently defaults to UTF-16 for “Unicode”, but that this may not be the case in the future.
For this reason, would it be better to use
[System.Text.Encoding]::UTF8.GetString($someByteArray)
instead of the following?
[System.Text.Encoding]::Unicode.GetString($someByteArray)
“this may not be the case in the future.”
System.Text.Encoding.Unicode isn't a potentially variable encoding; it's just Microsoft's (sadly misleading) name for UTF-16LE.
It isn't going to change. Even if Microsoft moved towards implementing Windows APIs natively in UTF-8 or UTF-32 (something there's no sign of ever happening), System.Text.Encoding.Unicode would have to remain UTF-16LE as that is how it is defined by the .NET specification.
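To make the "UTF-16LE" point concrete, here is a minimal sketch that encodes a single character and inspects the bytes. The variable names are only illustrative, and the sample character 'A' is an arbitrary choice.

```powershell
# Encode the letter 'A' (U+0041) with both UTF-16 byte orders.
$le = [System.Text.Encoding]::Unicode.GetBytes('A')
$be = [System.Text.Encoding]::BigEndianUnicode.GetBytes('A')

# Format the bytes as hex so the byte order is easy to see.
$leHex = ($le | ForEach-Object { $_.ToString('X2') }) -join ' '
$beHex = ($be | ForEach-Object { $_.ToString('X2') }) -join ' '

"Unicode (little-endian):       $leHex"   # 41 00
"BigEndianUnicode (big-endian): $beHex"   # 00 41

# The code page number for Encoding.Unicode is 1200, i.e. UTF-16LE.
$codePage = [System.Text.Encoding]::Unicode.CodePage
"Code page: $codePage"
```

The low byte comes first with Unicode, which is what "LE" (little-endian) means.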
“would it be better to use UTF8 instead of Unicode?”
Use UTF8 if the byte array contains UTF-8-encoded bytes, and use Unicode if they are in UTF-16LE.
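As a rough illustration of why the decoder has to match the bytes, the sketch below encodes one sample string both ways and then deliberately decodes the UTF-8 bytes with the wrong encoding. The sample text and variable names are assumptions made up for the example; the non-ASCII character is built with [char] so the script does not depend on how the script file itself is saved.

```powershell
# Sample text "héllo"; é is U+00E9.
$text = "h$([char]0xE9)llo"

$utf8Bytes  = [System.Text.Encoding]::UTF8.GetBytes($text)
$utf16Bytes = [System.Text.Encoding]::Unicode.GetBytes($text)

# Decoding with the matching encoding round-trips the text.
$fromUtf8  = [System.Text.Encoding]::UTF8.GetString($utf8Bytes)
$fromUtf16 = [System.Text.Encoding]::Unicode.GetString($utf16Bytes)

# Decoding with the wrong encoding does not throw; it returns different text.
$mismatch = [System.Text.Encoding]::Unicode.GetString($utf8Bytes)

"UTF-8 round trip matches:  $($fromUtf8 -ceq $text)"
"UTF-16 round trip matches: $($fromUtf16 -ceq $text)"
"Mismatched decode matches: $($mismatch -ceq $text)"
```

Note that the mismatched call fails silently, so a wrong choice tends to show up as garbled output, not as an exception.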
If you get to choose what encoding is used to store data at rest, UTF-8 is usually the better choice for space efficiency: ASCII characters take one byte each in UTF-8 but two in UTF-16.
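The size difference can be checked directly with GetByteCount. This is only a sketch with made-up sample strings; the second one shows why the claim is "usually" and not "always", since some characters take three bytes in UTF-8 but two in UTF-16.

```powershell
# ASCII-only sample: 12 characters.
$ascii = 'Hello, world'
$asciiUtf8  = [System.Text.Encoding]::UTF8.GetByteCount($ascii)
$asciiUtf16 = [System.Text.Encoding]::Unicode.GetByteCount($ascii)

# A single CJK character (U+65E5), built with [char] to avoid
# depending on the encoding of the script file.
$cjk = [string][char]0x65E5
$cjkUtf8  = [System.Text.Encoding]::UTF8.GetByteCount($cjk)
$cjkUtf16 = [System.Text.Encoding]::Unicode.GetByteCount($cjk)

"ASCII text: UTF-8 = $asciiUtf8 bytes, UTF-16 = $asciiUtf16 bytes"
"CJK char:   UTF-8 = $cjkUtf8 bytes, UTF-16 = $cjkUtf16 bytes"
```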
First, yes, Windows defaults to UTF-16. Personally I would use UTF-8, because most of the applications I write have to communicate with Linux applications or over some form of HTTP, where UTF-8 is more likely.
Besides, even if all your code is only used with Microsoft systems, it's easy to convert to UTF-8, and a simple search-and-replace regular expression over the source could change every call over to Unicode (UTF-16) if .NET ever started requiring it.
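For the conversion itself, one possible approach is the static Encoding.Convert method, which re-encodes a byte array from one encoding to another. The sketch below assumes the input bytes really are UTF-16LE; the sample text and variable names are hypothetical.

```powershell
# Sample text "café"; é is U+00E9.
$text = "caf$([char]0xE9)"
$utf16Bytes = [System.Text.Encoding]::Unicode.GetBytes($text)

# Re-encode the UTF-16LE bytes as UTF-8 (source encoding first, then destination).
$utf8Bytes = [System.Text.Encoding]::Convert(
    [System.Text.Encoding]::Unicode,
    [System.Text.Encoding]::UTF8,
    $utf16Bytes)

# Decoding the converted bytes as UTF-8 gives back the original text.
$roundTrip = [System.Text.Encoding]::UTF8.GetString($utf8Bytes)

"UTF-16 bytes: $($utf16Bytes.Length), UTF-8 bytes: $($utf8Bytes.Length)"
"Round trip matches: $($roundTrip -ceq $text)"
```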