If I need to have the following python value, unicode char '0':
>>> unichr(0)
u'\x00'
How can I define it in Lua?
There isn't one.
Lua has no concept of a Unicode value. Lua has no concept of Unicode at all. All Lua strings are 8-bit sequences of "characters", and all Lua string functions will treat them as such. Lua does not treat strings as having any Unicode encoding; they're just a sequence of bytes.
You can insert an arbitrary number into a string. For example:
"\065\066"
Is equivalent to:
"AB"
The \
notation is followed by 3 digits (or one of the escape characters), which must be less than or equal to 255. Lua is perfectly capable of handling strings with embedded \000
characters.
But you cannot directly insert Unicode codepoints into Lua strings. You can decompose the codepoint into UTF-8 and use the above mechanism to insert the codepoint into a string. For example:
"x\226\131\151"
This is the x
character followed by the Unicode combining above arrow character.
But since no Lua functions actually understand UTF-8, you will have to expose some function that expects a UTF-8 string in order for it to be useful in any way.
How about
function unichr(ord)
if ord == nil then return nil end
if ord < 32 then return string.format('\\x%02x', ord) end
if ord < 126 then return string.char(ord) end
if ord < 65539 then return string.format("\\u%04x", ord) end
if ord < 1114111 then return string.format("\\u%08x", ord) end
end
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With