
Lua - read one UTF-8 character from file

Is it possible to read one UTF-8 character from file?

file:read(1) returns weird characters when I print the result.

function firstLetter(str)
  return str:match("[%z\1-\127\194-\244][\128-\191]*")
end

This function returns the first UTF-8 character of the string str. I need to read one UTF-8 character the same way, but from an input file, and I don't want to load the whole file into memory with file:read("*all").

The question is pretty similar to this post: Extract the first letter of a UTF-8 string with Lua

Hrablicky asked Nov 01 '22 05:11


1 Answer

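function read_utf8_char(file)
  -- file:read(1) reads one byte, which is only the first byte of a UTF-8 character
  local c1 = file:read(1)
  if not c1 then return nil end  -- end of file
  -- Count the leading 1 bits of the first byte; ASCII bytes are clamped to 128,
  -- so ctr ends up as the number of continuation bytes that follow
  local ctr, c = -1, math.max(c1:byte(), 128)
  repeat
    ctr = ctr + 1
    c = (c - 128)*2
  until c < 128
  if ctr == 0 then return c1 end  -- file:read(0) would return nil at end of file
  -- A truncated file yields only the bytes that are actually present
  return c1..(file:read(ctr) or "")
end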
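
As a usage illustration, here is a minimal, self-contained sketch that walks a file one UTF-8 character at a time without loading it into memory. The helper names utf8_trailing and utf8_chars are hypothetical, and the sketch uses a lead-byte range lookup rather than the bit-shifting loop above. It assumes well-formed UTF-8 input and opens the file in binary mode so that no newline translation alters the bytes.

```lua
-- Hypothetical helper: number of continuation bytes implied by a lead byte.
-- Assumes well-formed UTF-8; stray continuation bytes are returned on their own.
local function utf8_trailing(byte)
  if byte >= 0xF0 then return 3
  elseif byte >= 0xE0 then return 2
  elseif byte >= 0xC0 then return 1
  else return 0 end
end

-- Hypothetical iterator: yields one UTF-8 character per call, nil at end of file.
local function utf8_chars(file)
  return function()
    local lead = file:read(1)
    if not lead then return nil end
    local n = utf8_trailing(lead:byte())
    if n == 0 then return lead end
    -- "or ''" guards against a file truncated in the middle of a character
    return lead .. (file:read(n) or "")
  end
end

-- Demo: write three characters of 1, 2 and 3 bytes, then read them back.
local path = os.tmpname()
local out = assert(io.open(path, "wb"))
out:write("a\195\169\226\130\172")  -- "a", "é", "€"
out:close()

local chars = {}
local f = assert(io.open(path, "rb"))
for ch in utf8_chars(f) do
  chars[#chars + 1] = ch
end
f:close()
os.remove(path)

print(#chars)  --> 3
```

Because the iterator returns nil at end of file, it can be used directly in a generic for loop, and only a few bytes are held in memory at any time.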
Egor Skriptunoff answered Nov 15 '22 03:11