Status: So far the best answer's program executes in 33% of the time of the original program! But there is probably still other ways to optimize it.
Lua is currently the fastest scripting language out there, however Lua scores really bad in a few benchmarks against C/C++.
One of those is the mandelbrot test (Generate Mandelbrot set portable bitmap file N=16,000), where it scores a horrible 1:109(Multi Core) or 1:28(Single Core)
Since the Delta in speed is quite large, this is a good candidate for optimizations. Also I'm sure some that those who know who Mike Pall is might believe its not possible to optimize this any further, but that's blatantly wrong. Anyone who has done optimizations knows it is always possible to do better. Besides I did manage to get some extra performance with a few tweaks, so I know its possible :)
-- The Computer Language Shootout
-- http://shootout.alioth.debian.org/
-- contributed by Mike Pall
local width = tonumber(arg and arg[1]) or 100
local height, wscale = width, 2/width
local m, limit2 = 50, 4.0
local write, char = io.write, string.char
write("P4\n", width, " ", height, "\n")
for y=0,height-1 do
local Ci = 2*y / height - 1
for xb=0,width-1,8 do
local bits = 0
local xbb = xb+7
for x=xb,xbb < width and xbb or width-1 do
bits = bits + bits
local Zr, Zi, Zrq, Ziq = 0.0, 0.0, 0.0, 0.0
local Cr = x * wscale - 1.5
for i=1,m do
local Zri = Zr*Zi
Zr = Zrq - Ziq + Cr
Zi = Zri + Zri + Ci
Zrq = Zr*Zr
Ziq = Zi*Zi
if Zrq + Ziq > limit2 then
bits = bits + 1
break
end
end
end
if xbb >= width then
for x=width,xbb do bits = bits + bits + 1 end
end
write(char(255-bits))
end
end
So how could this be optimized (of course as with any optimization you have to measure your implementation to be sure its faster). And you aren't allowed to alter the C-core of Lua for this, or use LuaJit, its about finding ways to optimizing one of Lua's weak weak points.
Edit: Putting a Bounty on this as to make the challenge more fun.
So here's ~40% for a start:
-- The Computer Language Shootout
-- http://shootout.alioth.debian.org/
-- contributed by Mike Pall
local width = tonumber(arg and arg[1]) or 100
local height, wscale = width, 2/width
local m, limit2 = 50, 4.0
local write, char = io.write, string.char
function addChar (line, c)
table.insert(line, c)
for i=table.getn(line)-1, 1, -1 do
if string.len(line[i]) > string.len(line[i+1]) then
break
end
line[i] = line[i] .. table.remove(line)
end
end
local h2 = math.floor(height/2)
local hm = height - h2*2
local top_half = {}
for y=0,h2+hm do
local Ci = 2*y / height - 1
local line = {""}
for xb=0,width-1,8 do
local bits = 0
local xbb = xb+7
for x=xb,xbb < width and xbb or width-1 do
bits = bits + bits
local Zr, Zi, Zrq, Ziq = 0.0, 0.0, 0.0, 0.0
local Cr = x * wscale - 1.5
for i=1,m do
local Zri = Zr*Zi
Zr = Zrq - Ziq + Cr
Zi = Zri + Zri + Ci
Zrq = Zr*Zr
Ziq = Zi*Zi
if Zrq + Ziq > limit2 then
bits = bits + 1
break
end
end
end
for x=width,xbb do
bits = bits + bits + 1
end
addChar(line,char(255-bits))
end
line = table.concat(line)
table.insert(top_half,line)
end
write("P4\n", width, " ", height, "\n")
for y=1,h2+hm do
write(top_half[y])
end
for y=h2,1,-1 do
write(top_half[y])
end
-- MarkusQ
Now that there is at least one answer faster than my solution I'll post my answer
-- The Computer Language Shootout
-- http://shootout.alioth.debian.org/
-- contributed by Mike Pall
local width = tonumber(arg and arg[1]) or 100
local height, wscale = width, 2/width
local m, limit2 = 50, 4.0
local write, char = io.write, string.char
local insert = table.insert
local results={}
write("P4\n", width, " ", height, "\n")
for y=0,height-1 do
local Ci = 2*y / height - 1
for xb=0,width-1,8 do
local bits = 0
local xbb = xb+7
for x=xb,xbb < width and xbb or width-1 do
bits = bits + bits
local Zr, Zi, Zrq, Ziq = 0.0, 0.0, 0.0, 0.0
local Cr = x * wscale - 1.5
for i=1,m do
local Zri = Zr*Zi
Zr = Zrq - Ziq + Cr
Zi = Zri + Zri + Ci
Zrq = Zr*Zr
Ziq = Zi*Zi
if Zrq + Ziq > limit2 then
bits = bits + 1
break
end
end
end
if xbb >= width then
for x=width,xbb do bits = bits + bits + 1 end
end
insert(results,(char(255-bits)))
end
end
write(table.concat(results))
The trick is storing values to a table before sending them to the output. Something as simple as this reduced the run time to 58%.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With