
Is there any Lua library that converts a string to bytes using UTF8 encoding?

Tags:

lua

I wonder whether this kind of library exists.

Vicky asked May 03 '11 23:05




3 Answers

slnunicode is part of a collection of general-purpose Lua support libraries developed for the Selene database project.

It is also available as a rock through LuaRocks.

Doug Currie answered Sep 27 '22 15:09


Lua strings are sequences of bytes. When you store UTF8 text in one, you are already storing "UTF8 bytes". You can read those bytes as you would for any other string, using string.byte(s, i, j):

local bytes = { string.byte(unicodeString, 1, -1) }

Now bytes is a table containing your "UTF8 bytes". More information about string.byte and UTF8 in Lua is available at:

Standard Lua string library

Lua 5.3 standard utf8 library

A presentation by Roberto Ierusalimschy (one of the creators of Lua) on the future of Lua, which covers UTF8 support among other topics. It predates the built-in UTF8 support in Lua.
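
As a minimal sketch of the string.byte approach described above: the string is written with decimal byte escapes so the result does not depend on how the source file is saved, and the variable names are only illustrative.

```lua
-- "h\195\169llo" is "héllo": U+00E9 (é) is the two UTF-8 bytes 195, 169.
local s = "h\195\169llo"

-- string.byte returns one number per byte in the range i..j.
local bytes = { string.byte(s, 1, -1) }

print(#s)                        -- 6 (byte length, not character count)
print(table.concat(bytes, " "))  -- 104 195 169 108 108 111
```

Because string.byte returns every byte as a separate value, a very long string may exceed the number of values Lua can return at once; reading it in smaller ranges is probably safer in that case.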

negamartin answered Sep 27 '22 16:09


Since version 5.3, Lua has UTF-8 support in the standard library (the utf8 library).

For example, to get a UTF-8 string's code points:

for p, c in utf8.codes("瑞&于") do
  print(c)
end

Output:

29790
38
20110
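
Going the other way, a small sketch that assumes Lua 5.3 or later: utf8.char builds the string from the code points printed above, and string.byte then exposes its UTF-8 bytes.

```lua
-- Requires Lua 5.3+ (the utf8 library does not exist in 5.1/5.2).
-- Build the string from the three code points shown in the output above.
local s = utf8.char(29790, 38, 20110)

-- utf8.len counts code points; the # operator counts bytes.
print(utf8.len(s), #s)  -- 3   7

-- The raw UTF-8 bytes are available through string.byte as usual.
local bytes = { string.byte(s, 1, -1) }
print(table.concat(bytes, " "))  -- 231 145 158 38 228 186 142
```

The two CJK characters take three bytes each and "&" takes one, which is why the byte length is 7 for 3 code points.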
Yu Hao answered Sep 27 '22 16:09