I'm coding some J bindings in Python (https://gist.github.com/Synthetica9/73def2ec09d6ac491c98). However, I've run into a problem handling arbitrary-precision integers: the output doesn't make any sense. It's different every time (but of the same general magnitude). The relevant piece of code:
def JTypes(desc, master):
    newdesc = [item.contents.value for item in desc]
    type = newdesc[0]
    if debug: print type
    rank = newdesc[1]
    shape = ct.c_int.from_address(newdesc[2]).value
    adress = newdesc[3]
    #string
    if type == 2:
        charlist = (ct.c_char.from_address(adress+i) for i in range(shape))
        return "".join((i.value for i in charlist))
    #integer
    if type == 4:
        return ct.c_int.from_address(adress).value
    #arb-precision int
    if type == 64:
        return ct.c_int.from_address(adress).value
and
class J(object):
    def __init__(self):
        self.JDll = ct.cdll.LoadLibrary(os.path.join(jDir, "j.dll"))
        self.JProc = self.JDll.JInit()
    def __call__(self, code):
        #Exec code, I suppose.
        self.JDll.JDo(self.JProc, "tmp=:"+code)
        return JTypes(self.deepvar("tmp"),self)
Any help would be appreciated.
Short answer: J's extended precision integers are stored in base 10,000.
More specifically: A single extended integer is stored as an array of machine integers, each in the range [0,1e4). Thus, an array of extended integers is stored as a recursive data structure. The array of extended integers has type=64 ("extended integer"), and its elements, each itself (a pointer to) an array, have type=4 ("integer").
So, conceptually (using J notation), the array of large integers:
   123456 7890123 456789012x
is stored as a nested array of machine integers, each less than 10,000:
   1e4 #.^:_1&.> 123456 7890123 456789012x
+-------+-------+-----------+
|12 3456|789 123|4 5678 9012|
+-------+-------+-----------+
So, to recover the original large numbers, you'd have to interpret these digits¹ in base 10,000:
   10000x #.&> 12 3456 ; 789 123 ; 4 5678 9012
123456 7890123 456789012
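The same base-10,000 interpretation can be sketched on the Python side. This is a minimal illustration, not part of the original bindings: the helper name digits_to_int is hypothetical, and it assumes the digits have already been read out of memory into a list with the most significant digit first.

```python
def digits_to_int(digits, base=10000):
    """Interpret `digits` (most significant first) as one integer in `base`."""
    value = 0
    for d in digits:
        if not 0 <= d < base:
            raise ValueError("digit %r out of range for base %d" % (d, base))
        # Horner's rule: shift what we have by one digit, then add the next.
        value = value * base + d
    return value


# The three boxes from the J example above.
boxes = [[12, 3456], [789, 123], [4, 5678, 9012]]
print([digits_to_int(b) for b in boxes])  # [123456, 7890123, 456789012]
```

Because Python integers are arbitrary precision, no special bignum type is needed for the result.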
The only other 'x-type variables' in J are rational numbers, which, unsurprisingly, are stored as pairs of extended precision integers (one for the numerator, the other for the denominator). So if you have an array whose header indicates type='rational' and count=3, its data segment will have 6 elements (2*3). Take these pairwise and you have your array of ratios.
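Pairing up the data segment might look like the sketch below. It assumes the extended integers have already been decoded into Python ints, and that each pair is laid out numerator first, then denominator; that ordering is an assumption worth checking against real J output. The name pair_rationals is hypothetical.

```python
from fractions import Fraction


def pair_rationals(values):
    """Group a flat [num0, den0, num1, den1, ...] list into Fractions."""
    if len(values) % 2:
        raise ValueError("expected an even number of values")
    # Assumption: numerator comes first in each pair.
    return [Fraction(values[i], values[i + 1])
            for i in range(0, len(values), 2)]


# count=3 rationals, so the data segment holds 6 integers.
flat = [1, 2, 3, 4, 22, 7]
print(pair_rationals(flat))
# [Fraction(1, 2), Fraction(3, 4), Fraction(22, 7)]
```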
If you're trying to build a complete J-Python interface, you'll also have to handle boxed and sparse arrays, which are similarly nested. You can learn a lot by inspecting the binary and hexadecimal representations of J nouns using the tools built in to J.
Oh, and if you're wondering why J stores bignums in base 10,000: it's because 10,000 is big enough to keep the nested arrays compact, and a power-of-10 base makes it easy to format numbers in decimal.
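To illustrate the formatting point: with a power-of-10 base, the decimal string is just the digits concatenated, with every digit after the first zero-padded to four places. A small sketch, assuming most-significant-first digits with a non-zero leading digit (format_base10000 is a hypothetical name, not something J exposes):

```python
def format_base10000(digits):
    """Decimal string for base-10,000 digits given most significant first."""
    if not digits:
        return "0"
    head = str(digits[0])  # leading digit is printed without padding
    tail = "".join("%04d" % d for d in digits[1:])  # the rest need 4 places
    return head + tail


print(format_base10000([4, 5678, 9012]))  # 456789012
print(format_base10000([789, 123]))       # 7890123
```

No division or remainder is needed, which would not be the case for a power-of-2 base.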
¹ Take care to adjust for byte order (e.g. 4 5678 9012 may be represented in memory as 9012 5678 4).
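Reading such a digit array with ctypes and undoing the reversed order could be sketched as follows. Everything here is an assumption for illustration: the helper name read_extended, the element type ct.c_int (it may be a wider type on a 64-bit J), and the least-significant-first layout, which should be verified against the actual interpreter. The buffer below only simulates J's memory.

```python
import ctypes as ct


def read_extended(address, count, ctype=ct.c_int,
                  little_endian=True, base=10000):
    """Read `count` machine integers at `address` and combine them."""
    # View the raw memory as a C array and copy it into a Python list.
    digits = list((ctype * count).from_address(address))
    if little_endian:
        digits.reverse()  # make the most significant digit come first
    value = 0
    for d in digits:
        value = value * base + d
    return value


# Simulated memory holding 9012 5678 4, as in the footnote's example.
buf = (ct.c_int * 3)(9012, 5678, 4)
print(read_extended(ct.addressof(buf), 3))  # 456789012
```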