How do I convert U+0065 to UTF-32 format?
U+0065
0000 0000 0110 0101
UTF-32
xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx
Convert U+0065 to UTF-32:
0000 0000 0000 0000 0000 0000 0110 0101
The result in hex is 0x00000065.
Is that correct?
Yes, it is correct.
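UTF-32 always uses one 32-bit code unit per codepoint. Unicode defines codepoints only up to U+10FFFF, which needs at most 21 bits, so every codepoint fits with room to spare. A UTF-32 value is therefore always numerically the same as the codepoint itself, padded with leading zeros.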
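To check this yourself, here is a minimal Python sketch. The helper name utf32_code_unit is made up for illustration, and it assumes big-endian byte order with no byte order mark (the "utf-32-be" codec), which matches the way the bits are written out in the question.

```python
def utf32_code_unit(ch: str) -> int:
    """Return the single 32-bit UTF-32 code unit for a one-character string.

    Hypothetical helper for illustration; assumes big-endian, no BOM.
    """
    data = ch.encode("utf-32-be")       # always exactly 4 bytes per codepoint
    return int.from_bytes(data, "big")  # read those 4 bytes as one integer


unit = utf32_code_unit("\u0065")
print(f"0x{unit:08X}")         # hex form of the 32-bit value
print(f"{unit:032b}")          # the same value as 32 binary digits
print(unit == ord("\u0065"))   # compares the code unit with the codepoint
```

The last line compares the UTF-32 code unit with the codepoint returned by ord(), which is the equality the answer describes.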
Because U+0065 is in the U+0000..U+007F range, it is written in UTF-8 using 8 bits (01100101). In UTF-16, it is the same using 16 bits (00000000 01100101), and in UTF-32 using 32 bits (00000000 00000000 00000000 01100101).
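The three bit patterns can be reproduced with a short Python sketch. The function name bits is an assumption for illustration, and the big-endian codecs ("utf-16-be", "utf-32-be") are chosen so that no byte order mark is emitted and the bytes appear in the same order as written above.

```python
def bits(ch: str, encoding: str) -> str:
    """Return the encoded bytes of ch as space-separated 8-bit groups.

    Hypothetical helper for illustration only.
    """
    return " ".join(f"{byte:08b}" for byte in ch.encode(encoding))


# Print U+0065 in each encoding form, most significant byte first.
for encoding in ("utf-8", "utf-16-be", "utf-32-be"):
    print(f"{encoding:10} {bits(chr(0x0065), encoding)}")
```

Using plain "utf-16" or "utf-32" instead would prepend a byte order mark, so the output would be longer than the patterns shown here.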