I want to get the Unicode values of the characters in a kanji string. It might look something like let values: &[u16] = f("ののの");
When I use "の".as_bytes() I got [227, 129, 174].
When I use 'の'.escape_unicode() I got \u{306e}; the 0x306e is exactly what I want.
The char type can be cast to u32 using as. The line
println!("{:x}", 'の' as u32);
will print "306e" (using {:x} to format the number as hex).
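To get the values for a whole string, as in the question, the same cast can be applied to every char. A minimal sketch, where code_points is a hypothetical helper name and not something from the standard library:

```rust
/// Collects the Unicode scalar value of every character in `s`.
/// `code_points` is an illustrative name, not a standard library function.
fn code_points(s: &str) -> Vec<u32> {
    // `chars()` iterates over Unicode scalar values, not bytes,
    // so each 'の' yields one item even though it is 3 bytes in UTF-8.
    s.chars().map(|c| c as u32).collect()
}
```

Calling code_points("ののの") gives a Vec<u32> with three entries, each equal to 0x306e.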
If you are sure all your characters are in the BMP, you can in theory also cast directly to u16. For characters from supplementary planes this will silently give wrong results, though, e.g. '🝖' as u16 returns 0xf756 instead of the correct 0x1f756, so you need a strong reason to do this.
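If u16 values are really needed, one way to avoid the silent truncation is a checked conversion that fails for anything outside the BMP. This is only a sketch; bmp_code_points is a hypothetical helper name:

```rust
use std::convert::TryFrom;

/// Returns the code points of `s` as u16 values, or `None` if any
/// character lies outside the BMP (i.e. is above U+FFFF).
/// `bmp_code_points` is an illustrative name, not a standard library function.
fn bmp_code_points(s: &str) -> Option<Vec<u16>> {
    s.chars()
        // `u16::try_from` fails instead of truncating, unlike `as u16`.
        .map(|c| u16::try_from(c as u32).ok())
        // Collecting into `Option<Vec<_>>` stops at the first `None`.
        .collect()
}
```

With this, bmp_code_points("ののの") returns Some with three 0x306e values, while a string containing '🝖' returns None. If UTF-16 code units rather than code points are what is wanted, str::encode_utf16 yields u16 values and represents supplementary characters as surrogate pairs.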
Internally, a char is stored as a 32-bit number, so c as u32 for some character c only reinterprets the memory representation of the character as a u32.