
Lua - convert string to table

I want to convert a string to a table, split into individual characters. Each character must be a separate element of the table, for example:

a = "text"
-- convert string (a) to table (b)
-- show table (b)
b = {'t','e','x','t'}
user3074258 asked Dec 06 '13 12:12



4 Answers

Just index each character and put it at the same position in the table.

local str = "text"
local t = {}
for i = 1, #str do
    t[i] = str:sub(i, i)
end
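
As a minimal sketch, the same loop can be wrapped in a reusable function. The name to_chars is hypothetical (it is not part of the answer), and table.concat is used only to show that the table joins back into the original string:

```lua
-- Hypothetical helper: split a string into a table of one-character strings.
local function to_chars(str)
  local t = {}
  for i = 1, #str do
    t[i] = str:sub(i, i)   -- the i-th character goes to index i
  end
  return t
end

local chars = to_chars("text")
print(#chars)                -- number of elements
print(table.concat(chars))   -- joins the characters back together
```

Joining with table.concat is a quick way to check that no character was lost or reordered.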
Oleg V. Volkov answered Mar 02 '23 01:03



You can use the code below to achieve this easily.

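local t = {}
local str = "text"

-- copy each one-character substring into the table
for i = 1, string.len(str) do
  t[i] = string.sub(str, i, i)
end

-- ipairs guarantees index order; pairs does not
for k, v in ipairs(t) do
  print(k, v)
end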

-- 1    t
-- 2    e
-- 3    x
-- 4    t

Using string.sub

string.sub(s, i [, j])

Returns a substring of the string passed. The substring starts at i. If the third argument j is not given, the substring runs to the end of the string; if it is given, the substring ends at and includes position j.
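
A few calls illustrate the description above. The variable names are only for illustration:

```lua
local s = "text"

local first_two  = s:sub(1, 2)  -- positions 1 through 2, inclusive
local from_third = s:sub(3)     -- j omitted: runs to the end of the string
local single     = s:sub(2, 2)  -- i == j yields exactly one character

print(first_two, from_third, single)
```

The i == j form is the one used in the loop above to pick out each character.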

Prashant Gaur answered Mar 02 '23 01:03



You could use the string.gsub function:

t={}
str="text"
str:gsub(".",function(c) table.insert(t,c) end)
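
For readers unfamiliar with this idiom, here is a small sketch of what gsub does in this case. The pattern "." matches one character at a time, and because the callback returns nothing, each match is left unchanged in the result. The names result and count are only for illustration:

```lua
local t = {}
local str = "text"

-- gsub returns the resulting string and the number of matches made.
local result, count = str:gsub(".", function(c)
  t[#t + 1] = c   -- collect the matched character; returning nil keeps it as is
end)

print(result, count, #t)
```

The call is used here purely for its side effect of filling t; the returned string is the same as the input.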
moteus answered Mar 02 '23 00:03



The builtin string library treats Lua strings as byte arrays. An alternative that works on multibyte (Unicode) characters is the unicode library that originated in the Selene project. Its main selling point is that it can be used as a drop-in replacement for the string library, making most string operations “magically” Unicode-capable.
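
A short sketch of the byte-array behaviour: the character é (U+00E9) is encoded in UTF-8 as the two bytes 195 and 169, so a byte-wise split produces two elements, neither of which is a complete character. The escapes are used so the example does not depend on the source file's encoding:

```lua
local s = "\195\169"   -- UTF-8 encoding of é (U+00E9)

-- The length operator counts bytes, not characters.
print(#s)

-- Splitting with string.sub cuts the character into its two bytes.
local t = {}
for i = 1, #s do
  t[i] = s:sub(i, i)
end
print(#t, t[1]:byte(), t[2]:byte())
```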

If you prefer not to add third-party dependencies, your task can easily be implemented using LPeg. Here is an example splitter:

local lpeg       = require "lpeg"
local C, Ct, R   = lpeg.C, lpeg.Ct, lpeg.R
local lpegmatch  = lpeg.match

local split_utf8 do
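  local utf8_x  = R"\128\191"                        -- continuation byte
  local utf8_1  = R"\000\127"                        -- one-byte sequence (ASCII)
  local utf8_2  = R"\194\223" * utf8_x               -- two-byte sequence
  local utf8_3  = R"\224\239" * utf8_x * utf8_x      -- three-byte sequence
  local utf8_4  = R"\240\244" * utf8_x * utf8_x * utf8_x  -- four-byte sequence
  local utf8    = utf8_1 + utf8_2 + utf8_3 + utf8_4  -- any single character
  local split   = Ct (C (utf8)^0) * -1               -- capture every character; fail unless the whole string matches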

  split_utf8 = function (str)
    str = str and tostring (str)
    if not str then return end
    return lpegmatch (split, str)
  end
end

This snippet defines the function split_utf8(), which creates a table of UTF-8 characters (as Lua strings) but returns nil if the string is not a valid UTF-8 sequence. You can run this test code:
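
To make the byte ranges in the grammar easier to follow, here is a rough plain-Lua counterpart that needs no library. It is a sketch, not the answer's method: the name split_utf8_plain is hypothetical, and it applies the same lead-byte and continuation-byte ranges as the LPeg patterns above, returning nil on the first byte that does not fit.

```lua
-- Hypothetical plain-Lua counterpart of the LPeg splitter.
local function split_utf8_plain(str)
  local chars, i, n = {}, 1, #str
  while i <= n do
    local b = str:byte(i)
    local len
    if b <= 127 then len = 1                        -- like utf8_1
    elseif b >= 194 and b <= 223 then len = 2       -- like utf8_2
    elseif b >= 224 and b <= 239 then len = 3       -- like utf8_3
    elseif b >= 240 and b <= 244 then len = 4       -- like utf8_4
    else return nil end                             -- invalid lead byte
    if i + len - 1 > n then return nil end          -- sequence is cut off
    for k = i + 1, i + len - 1 do
      local c = str:byte(k)
      if c < 128 or c > 191 then return nil end     -- not a continuation byte
    end
    chars[#chars + 1] = str:sub(i, i + len - 1)
    i = i + len
  end
  return chars
end

local ok  = split_utf8_plain("a\195\169b")       -- "a", é, "b"
local bad = split_utf8_plain(">\255< invalid")   -- contains an invalid byte
print(ok and #ok, bad)
```

Like the grammar it mirrors, this checks only byte ranges, so it should be read as an illustration of the structure rather than a strict validator.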

tests = {
  en = [[Lua (/ˈluːə/ LOO-ə, from Portuguese: lua [ˈlu.(w)ɐ] meaning moon; ]]
    .. [[explicitly not "LUA"[1]) is a lightweight multi-paradigm programming ]]
    .. [[language designed as a scripting language with "extensible ]]
    .. [[semantics" as a primary goal.]],
  ru = [[Lua ([лу́а], порт. «луна») — интерпретируемый язык программирования, ]]
    .. [[разработанный подразделением Tecgraf Католического университета ]]
    .. [[Рио-де-Жанейро.]],
  gr = [[Η Lua είναι μια ελαφρή προστακτική γλώσσα προγραμματισμού, που ]]
    .. [[σχεδιάστηκε σαν γλώσσα σεναρίων με κύριο σκοπό τη δυνατότητα ]]
    .. [[επέκτασης της σημασιολογίας της.]],
  XX = ">\255< invalid"
}

-------------------------------------------------------------------------------

local limit = 14
for lang, str in next, tests do
  io.write "\n"
  io.write (string.format ("<%s %3d> ->", lang, #str))
  local chars = split_utf8 (str)
  if not chars then
    io.write " INVALID!"
  else
    io.write (string.format (" <%3d>", #chars))
    for i = 1, #chars > limit and limit or #chars do
      io.write (string.format (" %q", chars [i]))
    end
  end
end
io.write "\n"

By the way, building a table with LPeg is significantly faster than calling table.insert() repeatedly. Here are stats for splitting the whole of Gogol’s Dead Souls (in Russian, 1023814 bytes raw, 571395 UTF-8 characters) on my machine:

library        method                time in ms
string         table.insert()        380
string         t [#t + 1] = c        310
string         gmatch & for loop     280
slnunicode     table.insert()        220
slnunicode     t [#t + 1] = c        200
slnunicode     gmatch & for loop     170
lpeg           Ct (C (...))           70
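
The benchmark code itself is not shown, so the following is only an assumption about what the three string-library rows could look like. The variable names are hypothetical, and the byte-wise pattern "." is used for all three:

```lua
local str = "text"

-- Assumed variant 1: table.insert() in a gsub callback
local t1 = {}
str:gsub(".", function(c) table.insert(t1, c) end)

-- Assumed variant 2: t[#t + 1] = c in a gsub callback
local t2 = {}
str:gsub(".", function(c) t2[#t2 + 1] = c end)

-- Assumed variant 3: gmatch and a for loop
local t3 = {}
for c in str:gmatch(".") do
  t3[#t3 + 1] = c
end

print(#t1, #t2, #t3)
```

All three build the same table; they differ only in how each character is appended.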
Philipp Gesang answered Mar 01 '23 23:03
