
What can I do to increase the performance of a Lua program?

I asked a question about Lua performance, and one of the responses asked:

Have you studied general tips for keeping Lua performance high? i.e. know table creation and rather reuse a table than create a new one, use of 'local print=print' and such to avoid global accesses.

This is a slightly different question from Lua Patterns, Tips and Tricks because I'd like answers that specifically impact performance and (if possible) an explanation of why performance is impacted.

One tip per answer would be ideal.

Jon Ericson asked Sep 30 '08 19:09



2 Answers

In response to some of the other answers and comments:

It is true that as a programmer you should generally avoid premature optimization. But this is less true for scripting languages, where the compiler does not optimize much, or at all.

So whenever you write something in Lua that is executed very often, runs in a time-critical environment, or could run for a while, it is good to know what to avoid (and to avoid it).

This is a collection of what I found out over time. Some of it I found out over the net, but being of a suspicious nature when the interwebs are concerned I tested all of it myself. Also, I have read the Lua performance paper at Lua.org.

Some references:

  • Lua Performance Tips
  • Lua-users.org Optimisation Tips

Avoid globals

This is one of the most common hints, but stating it once more can't hurt.

Globals are stored in a hashtable by their name. Accessing them means you have to access a table index. While Lua has a pretty good hashtable implementation, it's still a lot slower than accessing a local variable. If you have to use a global more than once, assign its value to a local variable first; this pays off from the second access onward.

do
  x = gFoo + gFoo;
end
do -- this actually performs better.
  local lFoo = gFoo;
  x = lFoo + lFoo;
end

(Note that simple testing may yield different results, e.g. local x; for i=1, 1000 do x=i; end. Here the for loop header actually takes more time than the loop body, so profiling results could be distorted.)
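The same idea is often applied to library functions that are called in a hot loop. A minimal sketch, assuming a hypothetical function sum_sines that is not part of the original answer:

```lua
-- Resolve the global 'math' and its field 'sin' once, outside the loop.
local sin = math.sin

-- sum_sines is a hypothetical example function, used only for illustration.
local function sum_sines(n)
    local total = 0
    for i = 1, n do
        -- 'sin' is a local captured as an upvalue, so no global table lookup happens here.
        total = total + sin(i)
    end
    return total
end

print(sum_sines(1000))
```

Whether this is measurable in your program depends on how hot the loop is, so it is worth timing before and after.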

Avoid string creation

Lua hashes all strings on creation, this makes comparison and using them in tables very fast and reduces memory use since all strings are stored internally only once. But it makes string creation more expensive.

A popular option to avoid excessive string creation is using tables. For example, if you have to assemble a long string, create a table, put the individual strings in there, and then use table.concat to join them once at the end.

-- do NOT do something like this
local ret = "";
for i=1, C do
  ret = ret..foo();
end

If foo() returned only the character A, this loop would create a series of strings like "", "A", "AA", "AAA", etc. Each string would be hashed and would occupy memory until the garbage collector reclaims it. See the problem here?

-- this is a lot faster
local ret = {};
for i=1, C do
  ret[#ret+1] = foo();
end
ret = table.concat(ret);

This method creates no new strings inside the loop itself: the strings are created in the function foo, and only references are copied into the table. Afterwards, concat creates one more string "AAAAAA..." (depending on how large C is). Note that you could use i instead of #ret+1, but often you don't have such a convenient loop and there is no counter variable you can use.

Another trick I found somewhere on lua-users.org is to use gsub if you have to parse a string

some_string:gsub(".", function(m)
  return "A";
end);

This looks odd at first. The benefit is that gsub builds the result string "at once" in C, and it is only hashed when gsub returns it to Lua. This avoids table creation, but possibly has more function overhead (not if you call foo() anyway, but if foo() is actually an expression).
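To make the gsub pattern a bit more concrete, here is a small sketch that rewrites a string in a single gsub call. The function name upper_vowels is hypothetical and only serves as an illustration:

```lua
-- upper_vowels is a hypothetical example: it upper-cases every lowercase vowel.
local function upper_vowels(s)
    -- gsub returns two values (the new string and the match count);
    -- the extra parentheses keep only the string.
    return (s:gsub("[aeiou]", function(c)
        return c:upper()
    end))
end

print(upper_vowels("performance"))  --> pErfOrmAncE
```

The callback still runs once per match, so this mainly saves the intermediate table, not the function calls.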

Avoid function overhead

Use language constructs instead of functions where possible

function ipairs

When iterating a table, the function overhead from ipairs does not justify its use. To iterate a table, instead use

for k=1, #tbl do
  local v = tbl[k];
  -- ... use v here ...
end

For a table without holes it does exactly the same without the function call overhead (ipairs actually returns another function, which is then called for every element in the table, while #tbl is only evaluated once). It's a lot faster, even if you need the value. And if you don't...

Note for Lua 5.2: In 5.2 you can actually define a __ipairs field in the metatable, which does make ipairs useful in some cases. However, Lua 5.2 also makes the __len field work for tables, so you might still prefer the above code to ipairs as then the __len metamethod is only called once, while for ipairs you would get an additional function call per iteration.

functions table.insert, table.remove

Simple uses of table.insert and table.remove can be replaced by using the # operator instead. Basically this is for simple push and pop operations. Here are some examples:

table.insert(foo, bar);
-- does the same as
foo[#foo+1] = bar;

local x = table.remove(foo);
-- does the same as
local x = foo[#foo];
foo[#foo] = nil;

For shifts (eg. table.remove(foo, 1)), and if ending up with a sparse table is not desirable, it is of course still better to use the table functions.

Use tables for SQL-IN-like comparisons

You might - or might not - have decisions in your code like the following

if a == "C" or a == "D" or a == "E" or a == "F" then
   ...
end

Now this is a perfectly valid case, however (from my own testing) starting with 4 comparisons and excluding table generation, this is actually faster:

local compares = { C = true, D = true, E = true, F = true };
if compares[a] then
   ...
end

And since hash tables have constant look up time, the performance gain increases with every additional comparison. On the other hand if "most of the time" one or two comparisons match, you might be better off with the Boolean way or a combination.
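When the list of values is longer or comes from data, the lookup table can be built once by a small helper. This is only a sketch; make_set and is_match are hypothetical names, not part of any library:

```lua
-- make_set is a hypothetical helper: it turns a list of values into a lookup table.
local function make_set(list)
    local set = {}
    for i = 1, #list do
        set[list[i]] = true
    end
    return set
end

-- Build the set once, outside any hot path, so its creation cost is not paid per check.
local compares = make_set({ "C", "D", "E", "F" })

local function is_match(a)
    return compares[a] == true
end

print(is_match("D"), is_match("Z"))
```

Keeping the set in a local outside the function matters here: rebuilding it on every call would bring back the table creation cost that the measurement above excludes.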

Avoid frequent table creation

This is discussed thoroughly in Lua Performance Tips. Basically the problem is that Lua allocates your table on demand, and creating a new table this way actually takes more time than clearing its contents and filling it again.

However, this is a bit of a problem, since Lua itself does not provide a method for removing all elements from a table, and pairs() is not exactly a performance beast itself. I have not done any performance testing on this problem myself yet.

If you can, define a C function that clears a table, this should be a good solution for table reuse.
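For comparison, a pure-Lua fallback could look like the sketch below. The helper name clear is hypothetical, and since it relies on pairs its speed is untested here and may not beat creating a fresh table for small tables:

```lua
-- clear is a hypothetical helper, not a standard library function.
local function clear(t)
    -- Assigning nil to existing fields during a pairs traversal is allowed.
    for k in pairs(t) do
        t[k] = nil
    end
    return t
end

-- Reuse one table across rounds instead of creating a new one each time.
local buffer = {}
for round = 1, 3 do
    clear(buffer)
    for i = 1, 4 do
        buffer[i] = round * i
    end
    -- ... use buffer here ...
end
```

Measure both variants in your own workload before committing to either.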

Avoid doing the same over and over

This is the biggest problem, I think. While a compiler in a non-interpreted language can easily optimize away a lot of redundancies, Lua will not.

Memoize

Using tables, this can be done quite easily in Lua. For single-argument functions you can even replace them with a table and an __index metamethod. Even though this destroys transparency, performance is better on cached values due to one less function call.

Here is an implementation of memoization for a single argument using a metatable. (Important: This variant does not support a nil value argument, but is pretty damn fast for existing values.)

function tmemoize(func)
    return setmetatable({}, {
        __index = function(self, k)
            local v = func(k);
            self[k] = v
            return v;
        end
    });
end
-- usage (does not support nil values!)
local mf = tmemoize(myfunc);
local v  = mf[x];

You could also adapt this pattern for multiple input values.
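One possible way to extend the metatable approach to two arguments is to nest one cache per first argument. This is a sketch, not the original author's code; tmemoize2 and the add example are hypothetical names:

```lua
-- tmemoize2 is a hypothetical two-argument variant of tmemoize.
local function tmemoize2(func)
    return setmetatable({}, {
        __index = function(outer, a)
            -- One inner cache per first argument, created on first use.
            local inner = setmetatable({}, {
                __index = function(self, b)
                    local v = func(a, b)
                    self[b] = v
                    return v
                end
            })
            outer[a] = inner
            return inner
        end
    })
end

-- usage (neither argument may be nil)
local calls = 0
local function add(a, b)
    calls = calls + 1
    return a + b
end

local madd = tmemoize2(add)
local first = madd[2][3]   -- computed by calling add
local second = madd[2][3]  -- served from the cache
print(first, second, calls)  -- prints 5, 5 and 1
```

As with the single-argument version, a result of nil is not cached, so such calls are recomputed on every access.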

Partial application

The idea is similar to memoization, which is to "cache" results. But here, instead of caching the results of the function, you cache intermediate values by putting their calculation in a constructor function that defines the calculation function in its block. In reality I would just call it clever use of closures.

-- Normal function
function foo(a, b, x)
    return cheaper_expression(expensive_expression(a,b), x);
end
-- foo(a,b,x1);
-- foo(a,b,x2);
-- ...

-- Partial application
function foo(a, b)
    local C = expensive_expression(a,b);
    return function(x)
        return cheaper_expression(C, x);
    end
end
-- local f = foo(a,b);
-- f(x1);
-- f(x2);
-- ...

This way it is possible to easily create flexible functions that cache some of their work without too much impact on program flow.

An extreme variant of this would be Currying, but that is actually more a way to mimic functional programming than anything else.

Here is a more extensive ("real world") example with some code omitted, since it would otherwise easily take up the whole page here (namely, get_color_values actually does a lot of value checking and accepts mixed values):

function LinearColorBlender(col_from, col_to)
    local cfr, cfg, cfb, cfa = get_color_values(col_from);
    local ctr, ctg, ctb, cta = get_color_values(col_to);
    -- validate before doing arithmetic; a nil component would otherwise raise its own error first
    if not cfr or not ctr then
        error("One of given arguments is not a color.");
    end
    local cdr, cdg, cdb, cda = ctr-cfr, ctg-cfg, ctb-cfb, cta-cfa;

    return function(pos)
        if type(pos) ~= "number" then
            error("arg1 (pos) must be a number in range 0..1");
        end
        -- clamp pos to 0..1
        if pos < 0 then pos = 0; end;
        if pos > 1 then pos = 1; end;
        return cfr + cdr*pos, cfg + cdg*pos, cfb + cdb*pos, cfa + cda*pos;
    end
end
-- Call
local blender = LinearColorBlender({1,1,1,1},{0,0,0,1});
object:SetColor(blender(0.1));
object:SetColor(blender(0.3));
object:SetColor(blender(0.7));

You can see that once the blender has been created, the function only has to sanity-check a single value instead of up to eight. I even extracted the difference calculation; though it probably does not improve much, I hope it shows what this pattern tries to achieve.

dualed answered Oct 11 '22 22:10


If your Lua program is really too slow, use the Lua profiler and clean up expensive stuff or migrate to C. But if you're not sitting there waiting, your time is wasted.

The first law of optimization: Don't.

I'd love to see a problem where you have a choice between ipairs and pairs and can measure the effect of the difference.

The one easy piece of low-hanging fruit is to remember to use local variables within each module. It's generally not worth doing stuff like

 local strfind = string.find 

unless you can find a measurement telling you otherwise.
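Getting such a measurement does not need much. A rough sketch using os.clock from the standard library; time_it, subject, and the iteration count are hypothetical choices for illustration, and the numbers will vary by machine and Lua version:

```lua
-- time_it is a hypothetical helper: it runs fn 'iterations' times
-- and returns the CPU time used, in seconds.
local function time_it(fn, iterations)
    local start = os.clock()
    for _ = 1, iterations do
        fn()
    end
    return os.clock() - start
end

local strfind = string.find
local subject = "the quick brown fox"
local iterations = 100000

-- Plain find (4th argument true) so only the lookup style differs.
local t_global = time_it(function()
    return string.find(subject, "fox", 1, true)
end, iterations)

local t_local = time_it(function()
    return strfind(subject, "fox", 1, true)
end, iterations)

print(("string.find: %.3fs, local strfind: %.3fs"):format(t_global, t_local))
```

A micro-benchmark like this also times the loop and the wrapper closure, so treat small differences with suspicion.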

Norman Ramsey answered Oct 11 '22 20:10