
How can I retrieve an object by id in Julia

Tags:

julia

In Julia, say I have an object_id for a variable but have forgotten its name, how can I retrieve the object using the id?

I.e. I want the inverse of some_id = object_id(some_object).

asked May 31 '17 10:05 by conjectures


1 Answer

As @DanGetz says in the comments, object_id is a hash function and is designed not to be invertible. @phg is also correct that ObjectIdDict is intended precisely for this purpose (it is documented although not discussed much in the manual):

ObjectIdDict([itr])

ObjectIdDict() constructs a hash table where the keys are (always) object identities. Unlike Dict it is not parameterized on its key and value type and thus its eltype is always Pair{Any,Any}.

See Dict for further help.

In other words, it hashes objects by === using object_id as a hash function. If you have an ObjectIdDict and you use the objects you encounter as the keys into it, then you can keep them around and recover those objects later by taking them out of the ObjectIdDict.
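As a minimal sketch of that identity-keyed behavior, here is a comparison written for Julia 1.x, where ObjectIdDict is spelled IdDict (the variable names are purely illustrative). Two arrays that are equal but not identical occupy separate slots in an IdDict, while an ordinary Dict treats them as one key:

```julia
a = [1, 2, 3]
b = [1, 2, 3]            # a == b, but a !== b: two distinct objects

# Keys are compared with ===, so a and b are different keys.
by_identity = IdDict{Any,Any}()
by_identity[a] = "first"
by_identity[b] = "second"

# Keys are compared with isequal, so b overwrites the entry for a.
by_value = Dict{Any,Any}()
by_value[a] = "first"
by_value[b] = "second"

println(length(by_identity), " entries by identity, ",
        length(by_value), " entry by value")
```

Because the IdDict holds a reference to each key, both arrays stay alive for as long as the dictionary does.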

However, it sounds like you want to do this without the explicit ObjectIdDict just by asking which object ever created has a given object_id. If so, consider this thought experiment: if every object were always recoverable from its object_id, then the system could never discard any object, since it would always be possible for a program to ask for that object by ID. So you would never be able to collect any garbage, and the memory usage of every program would rapidly expand to use all of your RAM and disk space. This is equivalent to having a single global ObjectIdDict which you put every object ever created into. So inverting the object_id function that way would require never deallocating any objects, which means you'd need unbounded memory.

Even if we had infinite memory, there are deeper problems. What does it mean for an object to exist? In the presence of an optimizing compiler, this question doesn't have a clear-cut answer. It is often the case that an object appears, from the programmer's perspective, to be created and operated on, but in reality – i.e. from the hardware's perspective – it is never created. Consider this function which constructs a complex number and then uses it for a simple computation:

julia> function foo(y::Real)
           z = Complex(0,y)
           w = 2z*im
           return real(w)
       end
foo (generic function with 1 method)

julia> foo(123)
-246

From the programmer's perspective, this constructs the complex number z and then constructs 2z, then 2z*im, and finally constructs real(2z*im) and returns that value. So all of those values should be inserted into the "Great ObjectIdDict in the Sky". But are they really constructed? Here's the LLVM code for this function applied to an Int:

julia> @code_llvm foo(123)

define i64 @julia_foo_60833(i64) #0 !dbg !5 {
top:
  %1 = shl i64 %0, 1
  %2 = sub i64 0, %1
  ret i64 %2
}

No Complex values are constructed at all! All of the work is inlined and eliminated rather than actually being performed. The whole computation boils down to just doubling the argument (by shifting it left one bit) and negating it (by subtracting it from zero). This optimization can be done first and foremost because the intermediate steps have no observable side effects. The compiler knows that there's no way to tell the difference between actually constructing complex values and operating on them and just doing a couple of integer ops – as long as the end result is always the same. Implicit in the idea of a "Great ObjectIdDict in the Sky" is the assumption that all objects that seem to be constructed actually are constructed and inserted into a large, permanent data structure – which is a massive side effect. So not only is recovering objects from their IDs incompatible with garbage collection, it's also incompatible with almost every conceivable program optimization.
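To make the "no observable difference" point concrete, here is a small check, assuming a current Julia version, that the complex-number version agrees with plain integer arithmetic over a range of inputs. The name foo_plain is made up for this sketch; it just spells out the shift-and-negate computation as -2y.

```julia
# The function from the answer: it goes through Complex values on paper.
function foo(y::Real)
    z = Complex(0, y)
    w = 2z*im
    return real(w)
end

# Hypothetical hand-written equivalent: double the argument and negate it.
foo_plain(y::Integer) = -2y

# No caller can distinguish the two by their results on these inputs.
agree = all(foo(y) == foo_plain(y) for y in -1000:1000)
println("foo and foo_plain agree: ", agree)
```

Since nothing outside the function can tell the two apart, the compiler is free to emit the cheaper one.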

The only other way one could conceive of inverting object_id would be to compute its inverse image on demand instead of saving objects as they are created. That would solve both the memory and optimization problems. Of course, it isn't possible since there are infinitely many possible objects but only a finite number of object IDs. You are vanishingly unlikely to actually encounter two objects with the same ID in a program, but the finiteness of the ID space means that inverting the hash function is impossible in principle since the preimage of each ID value contains an infinite number of potential objects.
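The counting argument can be illustrated at a much smaller scale. The sketch below shrinks the ID space to 8 bits using a made-up toy_id function (an assumption for illustration, not how object_id is actually computed), so a collision is easy to find: among any 257 distinct values, at least two must share one of the 256 possible IDs.

```julia
# Hypothetical 8-bit "ID": truncate Julia's hash to its low byte.
toy_id(x) = hash(x) % UInt8

# Return the first pair of distinct integers in 1:n with the same toy ID,
# or nothing if no such pair exists in that range.
function first_collision(n::Integer)
    seen = Dict{UInt8,Int}()
    for i in 1:n
        id = toy_id(i)
        if haskey(seen, id)
            return (seen[id], i)
        end
        seen[id] = i
    end
    return nothing
end

# 257 values into 256 IDs: the pigeonhole principle guarantees a collision.
a, b = first_collision(257)
println(a, " and ", b, " share toy ID ", toy_id(a))
```

Given only that shared ID, there is no way to say whether it came from the first value or the second, which is the same obstacle the real ID space faces at a larger scale.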

I've probably refuted the possibility of an inverse object_id function far more thoroughly than necessary, but it led to some interesting thought experiments, and I hope it's been helpful – or at least thought-provoking. The practical answer is that there is no way to get around explicitly stashing every object you might want to get back later in an ObjectIdDict.
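One way the explicit stashing could look, as a rough sketch for Julia 1.x (where object_id is spelled objectid): instead of an identity-keyed dictionary, this variant uses a plain Dict keyed by the ID itself, so an object can be looked up directly from a saved ID. The names registry, stash! and lookup are hypothetical helpers, not part of Julia.

```julia
# Maps an object's ID to the object. Holding the object here keeps it alive.
registry = Dict{UInt,Any}()

# Record obj under its ID and hand the ID back to the caller.
function stash!(registry::Dict{UInt,Any}, obj)
    id = objectid(obj)
    registry[id] = obj
    return id
end

# Recover a previously stashed object, or nothing if the ID was never stashed.
lookup(registry::Dict{UInt,Any}, id::UInt) = get(registry, id, nothing)

v = [1, 2, 3]
id = stash!(registry, v)
recovered = lookup(registry, id)
println(recovered === v)
```

Only objects that were stashed can be recovered, and each one stays in memory until it is deleted from the registry, which is the trade-off the thought experiment above describes.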

answered Oct 21 '22 19:10 by StefanKarpinski