Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I deep-compare 2 Lua tables, which may or may not have tables as keys?

(Also posted on the Lua mailing list)

So I've been writing deep-copy algorithms, and I wanna test them to see if they work the way I want them to. While I do have access to the original->copy map, I want a general-purpose deep-compare algorithm that must be able to compare table keys (tables as keys?).

My deep-copy algorithm(s) are avaliable here: https://gist.github.com/SoniEx2/fc5d3614614e4e3fe131 (it's not very organized, but there are 3 of them, one uses recursive calls, the other uses a todo table, and the other simulates a call stack (in a very ugly but 5.1-compatible way))

Recursive version:

local function deep(inp,copies)
  if type(inp) ~= "table" then
    return inp
  end
  local out = {}
  copies = (type(copies) == "table") and copies or {}
  copies[inp] = out -- use normal assignment so we use copies' metatable (if any)
  for key,value in next,inp do -- skip metatables by using next directly
    -- we want a copy of the key and the value
    -- if one is not available on the copies table, we have to make one
    -- we can't do normal assignment here because metatabled copies tables might set metatables

    -- out[copies[key] or deep(key,copies)]=copies[value] or deep(value,copies)
    rawset(out,copies[key] or deep(key,copies),copies[value] or deep(value,copies))
  end
  return out
end

Edit: I found things like this which don't really handle tables as keys: http://snippets.luacode.org/snippets/Deep_Comparison_of_Two_Values_3 (Copy of snippet below)

function deepcompare(t1,t2,ignore_mt)
  local ty1 = type(t1)
  local ty2 = type(t2)
  if ty1 ~= ty2 then return false end
  -- non-table types can be directly compared
  if ty1 ~= 'table' and ty2 ~= 'table' then return t1 == t2 end
  -- as well as tables which have the metamethod __eq
  local mt = getmetatable(t1)
  if not ignore_mt and mt and mt.__eq then return t1 == t2 end
  for k1,v1 in pairs(t1) do
    local v2 = t2[k1]
    if v2 == nil or not deepcompare(v1,v2) then return false end
  end
  for k2,v2 in pairs(t2) do
    local v1 = t1[k2]
    if v1 == nil or not deepcompare(v1,v2) then return false end
  end
  return true
end

Serializing is also not an option, as order of serialization is "random".

like image 303
SoniEx2 Avatar asked Sep 18 '14 21:09

SoniEx2


2 Answers

As others said, that depends a lot on your definition of equivalence. If you want this to be true:

local t1 = {[{}] = {1}, [{}] = {2}}
local t2 = {[{}] = {1}, [{}] = {2}}
assert( table_eq(t1, t2) )

If you do, then each time the key in t1 is a table, you'll have to check its equivalence with every table key in t2 and try them one by one. This is a way to do it (metatable stuff left out for readability).

function table_eq(table1, table2)
   local avoid_loops = {}
   local function recurse(t1, t2)
      -- compare value types
      if type(t1) ~= type(t2) then return false end
      -- Base case: compare simple values
      if type(t1) ~= "table" then return t1 == t2 end
      -- Now, on to tables.
      -- First, let's avoid looping forever.
      if avoid_loops[t1] then return avoid_loops[t1] == t2 end
      avoid_loops[t1] = t2
      -- Copy keys from t2
      local t2keys = {}
      local t2tablekeys = {}
      for k, _ in pairs(t2) do
         if type(k) == "table" then table.insert(t2tablekeys, k) end
         t2keys[k] = true
      end
      -- Let's iterate keys from t1
      for k1, v1 in pairs(t1) do
         local v2 = t2[k1]
         if type(k1) == "table" then
            -- if key is a table, we need to find an equivalent one.
            local ok = false
            for i, tk in ipairs(t2tablekeys) do
               if table_eq(k1, tk) and recurse(v1, t2[tk]) then
                  table.remove(t2tablekeys, i)
                  t2keys[tk] = nil
                  ok = true
                  break
               end
            end
            if not ok then return false end
         else
            -- t1 has a key which t2 doesn't have, fail.
            if v2 == nil then return false end
            t2keys[k1] = nil
            if not recurse(v1, v2) then return false end
         end
      end
      -- if t2 has a key which t1 doesn't have, fail.
      if next(t2keys) then return false end
      return true
   end
   return recurse(table1, table2)
end

assert( table_eq({}, {}) )
assert( table_eq({1,2,3}, {1,2,3}) )
assert( table_eq({1,2,3, foo = "fighters"}, {["foo"] = "fighters", 1,2,3}) )
assert( table_eq({{{}}}, {{{}}}) )
assert( table_eq({[{}] = {1}, [{}] = {2}}, {[{}] = {1}, [{}] = {2}}) )
assert( table_eq({a = 1, [{}] = {}}, {[{}] = {}, a = 1}) )
assert( table_eq({a = 1, [{}] = {1}, [{}] = {2}}, {[{}] = {2}, a = 1, [{}] = {1}}) )

assert( not table_eq({1,2,3,4}, {1,2,3}) )
assert( not table_eq({1,2,3, foo = "fighters"}, {["foo"] = "bar", 1,2,3}) )
assert( not table_eq({{{}}}, {{{{}}}}) )
assert( not table_eq({[{}] = {1}, [{}] = {2}}, {[{}] = {1}, [{}] = {2}, [{}] = {3}}) )
assert( not table_eq({[{}] = {1}, [{}] = {2}}, {[{}] = {1}, [{}] = {3}}) )
like image 79
Hisham H M Avatar answered Nov 09 '22 15:11

Hisham H M


Instead of direct comparison you may try to serialize each of the tables and compare the serialized results. There are several serializers that handle table as keys and can serialize deep and recursive structures. For example, you may try Serpent (available as a standalone module and also included in Mobdebug):

local dump = pcall(require, 'serpent') and require('serpent').dump
  or pcall(require, 'mobdebug') and require('mobdebug').dump
  or error("can't find serpent or mobdebug")
local a = dump({a = 1, [{}] = {}})
local b = dump({[{}] = {}, a = 1})
print(a==b)

This returns true for me (both Lua 5.1 and Lua 5.2). One of the caveats is that the serialization result depends on the order in which keys are serialized. The tables as key by default are sorted based on their tostring value, which means that the order depends on their allocation address and it's not difficult to come up with an example that fails under Lua 5.2:

local dump = pcall(require, 'serpent') and require('serpent').dump
  or pcall(require, 'mobdebug') and require('mobdebug').dump
  or error("can't find serpent or mobdebug")
local a = dump({a = 1, [{}] = {1}, [{}] = {2}})
local b = dump({[{}] = {2}, a = 1, [{}] = {1}})
print(a==b) --<-- `false` under Lua 5.2

One way to protect against this is to consistently represent tables in keys comparison; for example, instead of default tostring, you may want to serialize tables and their values and sort the keys based on that (serpent allows a custom handler for sortkeys), which would make the sorting more robust and the serialized results more stable.

like image 1
Paul Kulchenko Avatar answered Nov 09 '22 14:11

Paul Kulchenko