
Binary fields encoding/serialization format in a proprietary XML file (Roche LC480 .ixo file)

I recently received an example export file generated by the Roche LightCycler 480 instrument. It uses a proprietary XML format, for which I haven't found a specification yet.

I would like to extract some information from files of this type. Although most of the content is easy to parse and interpret, it contains a number of (unpadded) base64-encoded fields of binary/serialized data representing arrays of integer and/or floating-point numbers. A link to the example file can be found in this gist.

I have included a fragment of it at the end of this post. The AcquisitionTable contains a total of 19 such encoded item entries, which likely represent arrays of integer (e.g. SampleNo) and floating-point (e.g. Fluor1) values.

How the decoded bytes translate to integer or floating-point values is still unclear to me. After base64 decoding, each item starts with the following 6-byte sequence (hex):

42 41 52 5A 00 00 ...    // ['B','A','R','Z','\0','\0', ...]

Note that while it is my expectation that each 'item' contains the same amount of numbers (or "rows" in this table), I am observing a different number of decoded bytes for similar items: 5654 for Fluor1 and 5530 for Fluor2.
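The decoding step and the 6-byte prefix noted above can be reproduced with a few lines of Python. This is a minimal sketch, not the instrument software's method: `decode_unpadded` is a hypothetical helper name, and `sample` is just the first 16 characters of the SampleNo item from the fragment at the end of this post.

```python
import base64

def decode_unpadded(b64_text):
    """Decode base64 text that lacks '=' padding and may contain line breaks."""
    cleaned = "".join(b64_text.split())      # drop whitespace/newlines from the XML
    padding = "=" * (-len(cleaned) % 4)      # restore the padding the file omits
    return base64.b64decode(cleaned + padding)

# First 16 characters of the SampleNo item (assumption: enough to show the prefix).
sample = "QkFSWgAABHgCAER0"
raw = decode_unpadded(sample)

# Print the first six bytes as hex; they should spell 'B','A','R','Z' plus two zero bytes.
print(raw[:6].hex(" "))
```

Stripping whitespace first matters because some items in the file span several lines.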

Additionally for those arrays which I suspect contain (sequential) integers, a pattern can be observed:

SampleNo : ... 1F F5 1F 07 2F 19 2F 2B 2F 3D 2F 4F 2F 61 2F 00 73 2F 85 2F 97 2F A9 2F BB 2F CD 2F DF 2F F1 2F 00 03 3F 15 3F 27 ...
Cycles   : ... 1F FF 1F 11 2F 23 2F 35 2F 47 2F 59 2F 6B 2F 00 7D 2F 8F 2F A1 2F B3 2F C5 2F D7 2F E9 2F FB 2F 00 0D 3F 1F 3F 31 ...
Gain     : ... 1F EE 1F 00 2F 12 2F 24 2F 36 2F 00 48 2F 5A 2F 6C 2F 7E 2F 90 2F A2 2F B4 2F C6 2F 00 D8 2F EA 2F FC 2F 0E 3F 20 3F 32 ...

It looks like pairs of bytes in which the second byte increases by 0x12 (18), with an occasional group of 3 bytes that has 0x00 as its second byte. The latter occurs when the low nibble of the group's last byte is 3, D or 8 for the three examples, respectively.
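The step of 0x12 can be checked mechanically on the SampleNo fragment quoted above. The sketch below rests on an assumption that is not confirmed by any specification: each pair is read as a 12-bit value made of the first byte plus the high nibble of the second byte (the nibble that changes from 1 to 2 to 3 whenever the first byte wraps around).

```python
# SampleNo fragment copied from the hex dump above.
fragment = bytes.fromhex(
    "1F F5 1F 07 2F 19 2F 2B 2F 3D 2F 4F 2F 61 2F 00 73 2F 85 2F "
    "97 2F A9 2F BB 2F CD 2F DF 2F F1 2F 00 03 3F 15 3F 27"
)

# Skip the leading 0x1F (the tail of a pair that starts before the fragment)
# and drop the 0x00 bytes. Dropping them is only safe here because no pair in
# this particular fragment contains a zero byte; it is not safe in general.
body = bytes(b for b in fragment[1:] if b != 0)

# Assumption: each pair is (low 8 bits, high nibble combined with a constant 0xF).
pairs = list(zip(body[0::2], body[1::2]))
values = [((second >> 4) << 8) | first for first, second in pairs]

# Differences between consecutive 12-bit values.
steps = [b - a for a, b in zip(values, values[1:])]
print([hex(v) for v in values[:4]], set(steps))
```

For this fragment every difference is 18, and the low nibble of every second byte is 0xF, which is consistent with the observation above without explaining what the values mean.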

I was wondering if the type of encoding/serialization format would be obvious to anyone (or, even better, if someone has a specification of this file format).

I believe the software used to create these files is currently Java based, but has a history as a Windows/MFC/C++ product.

<obj name="AcquisitionTable" class="AcquisitionTable" version="1">
    <prop name="Count">2400</prop>
    <prop name="ChannelCount">6</prop>
    <list name="Columns" count="19">
        <item name="SampleNo">QkFSWgAABHgCAER0Cu3xAe3wAuv//f8PDyEPADMPRQ9XD2kPew+ND58PsQ8Aww/VD+cP+Q8LHx0fLx9BHwBTH2Ufdx+JH5sfrR+/H9EfAOMf9R8HLxkvKy89L08vYS8Acy+FL5cvqS+7L80v3y/xLwADPxU/Jz85P0s/XT9vP4E/AJM/pT+3P8k/2z/tP/8/EU8AI081T0dPWU9rT31Pj0+hTwCzT8VP10/pT/tPDV8fXzFfAENfVV9nX3lfi1+dX69fwV8A01/lX/dfCW8bby1vP29RbwBjb3Vvh2+Zb6tvvW/Pb+FvAPNvBX8Xfyl/O39Nf19/cX8Ag3+Vf6d/uX/Lf91/738BjwATjyWPN49Jj1uPbY9/j5GPAKOPtY/Hj9mP64/9jw+fIZ8AM59Fn1efaZ97n42fn5+xnwDDn9Wf55/5nwuvHa8vr0GvAFOvZa93r4mvm6+tr7+v0a8A46/1rwe/Gb8rvz2/T79hvwBzv4W/l7+pv7u/zb/fv/G/AAPPFc8nzznPS89dz2/Pgc8Ak8+lz7fPyc/bz+3P/88R3wAj3zXfR99Z32vffd+P36HfALPfxd/X3+nf+98N7x/vMe8AQ+9V72fvee+L753vr+/B7wDT7+Xv9+8J/xv/Lf8//1H/AGP/df+H/5n/q/+9/8//4f8A8/8FDxcPKQ87D00PXw9xDwCDD5UPpw+5D8sP3Q/vDwEfABMfJR83H0kfWx9tH38fkR8Aox+1H8cf2R/rH/0fDy8hLwAzL0UvVy9pL3svjS+fL7EvAMMv1S/nL/kvCz8dPy8/QT8AUz9lP3c/iT+bP60/vz/RPwDjP/U/B08ZTytPPU9PT2FPAHNPhU+XT6lPu0/NT99P8U8AA18VXydfOV9LX11fb1+BXwCTX6Vft1/JX9tf7V//XxFvACNvNW9Hb1lva299b49voW8As2/Fb9dv6W/7bw1/H38xfwBDf1V/Z395f4t/nX+vf8F/ANN/5X/3fwmPG48tjz+PUY8AY491j4ePmY+rj72Pz4/hjwDzjwWfF58pnzufTZ9fn3GfAIOflZ+nn7mfy5/dn++fAa8AE68lrzevSa9br22vf6+RrwCjr7Wvx6/Zr+uv/a8PvyG/ADO/Rb9Xv2m/e7+Nv5+/sb8Aw7/Vv+e/+b8Lzx3PL89BzwBTz2XPd8+Jz5vPrc+/z9HPAOPP9c8H3xnfK98930/fYd8Ac9+F35ffqd+7383f39/x3wAD7xXvJ+8570vvXe9v74HvAJPvpe+378nv2+/t7//vEf8AI/81/0f/Wf9r/33/j/+h/wCz/8X/1//p//v/DQ8fDzEPAEMPVQ9nD3kPiw+dD68PwQ8A0w/lD/cPCR8bHy0fPx9RHwBjH3Ufhx+ZH6sfvR/PH+EfAPMfBS8XLykvOy9NL18vcS8Agy+VL6cvuS/LL90v7y8BPwATPyU/Nz9JP1s/bT9/P5E/AKM/tT/HP9k/6z/9Pw9PIU8AM09FT1dPaU97T41Pn0+xTwDDT9VP50/5TwtfHV8vX0FfAFNc</item>
        <item name="ProgramNo">QkFSWgAABHMCAERvANz///8RDyMPNQ9HD1kPaw8AfQ+PD6EPsw/FD9cP6Q/7DwANHx8fMR9DH1UfZx95H4sfAJ0frx/BH9Mf5R/3HwkvGy8ALS8/L1EvYy91L4cvmS+rLwC9L88v4S/zLwU/Fz8pPzs/AE0/Xz9xP4M/lT+nP7k/yz8A3T/vPwFPE08lTzdPSU9bTwBtT39PkU+jT7VPx0/ZT+tPAP1PD18hXzNfRV9XX2lfe18AjV+fX7Ffw1/VX+df+V8LbwAdby9vQW9Tb2Vvd2+Jb5tvAK1vv2/Rb+Nv9W8Hfxl/K38APX9Pf2F/c3+Ff5d/qX+7fwDNf99/8X8DjxWPJ485j0uPAF2Pb4+Bj5OPpY+3j8mP248A7Y//jxGfI581n0efWZ9rnwB9n4+foZ+zn8Wf15/pn/ufAA2vH68xr0OvVa9nr3mvi68Ana+vr8Gv06/lr/evCb8bvwAtvz+/Ub9jv3W/h7+Zv6u/AL2/z7/hv/O/Bc8XzynPO88ATc9fz3HPg8+Vz6fPuc/LzwDdz+/PAd8T3yXfN99J31vfAG3ff9+R36Pftd/H39nf698A/d8P7yHvM+9F71fvae977wCN75/vse/D79Xv5+/57wv/AB3/L/9B/1P/Zf93/4n/m/8Arf+//9H/4//1/wcPGQ8rDwA9D08PYQ9zD4UPlw+pD7sPAM0P3w/xDwMfFR8nHzkfSx8AXR9vH4Efkx+lH7cfyR/bHwDtH/8fES8jLzUvRy9ZL2svAH0vjy+hL7MvxS/XL+kv+y8ADT8fPzE/Qz9VP2c/eT+LPwCdP68/wT/TP+U/9z8JTxtPAC1PP09RT2NPdU+HT5lPq08AvU/PT+FP808FXxdfKV87XwBNX19fcV+DX5Vfp1+5X8tfAN1f718BbxNvJW83b0lvW28AbW9/b5Fvo2+1b8dv2W/rbwD9bw9/IX8zf0V/V39pf3t/AI1/n3+xf8N/1X/nf/l/C48AHY8vj0GPU49lj3ePiY+bjwCtj7+P0Y/jj/WPB58ZnyufAD2fT59hn3OfhZ+Xn6mfu58AzZ/fn/GfA68VryevOa9LrwBdr2+vga+Tr6Wvt6/Jr9uvAO2v/68RvyO/Nb9Hv1m/a78Afb+Pv6G/s7/Fv9e/6b/7vwANzx/PMc9Dz1XPZ895z4vPAJ3Pr8/Bz9PP5c/3zwnfG98ALd8/31HfY99134ffmd+r3wC938/f4d/z3wXvF+8p7zvvAE3vX+9x74Pvle+n77nvy+8A3e/v7wH/E/8l/zf/Sf9b/wBt/3//kf+j/7X/x//Z/+v/AP3/Dw8hDzMPRQ9XD2kPew8AjQ+fD7EPww/VD+cP+Q8LHwAdHy8fQR9TH2Ufdx+JH5sfAK0fvx/RH+Mf9R8HLxkvKy8APS9PL2Evcy+FL5cvqS+7LwDNL98v8S8DPxU/Jz85P0s/AF0/bz+BP5M/pT+3P8k/2z8A7T//PxFPI081T0dPWU9rTwB9T49PoU+zT8VP10/pT/tPAA1fH18xX0NfVV9nUwA</item>

... snipped

        <item name="Fluor1">QkFSWgAAFg0CAFYJ+xwg7vGsP1qIWb738CFAHegc//CsnT/u9cqGyQ8A/PcbfVgeAas/qoOpJwDu/P9SgVE/ACFAHmuwHUcArR0GwX9WAWYUD2l9bgFcD9l7hgFmdA9Jep4BLA8pd7YBZqQPmXXOAYwP0XTmAWa8D0Fz/gHsD7FxFhFmBB9Zby4RHB8BbUYRZjQfqWpeEUwfGWl2EWZkH8FmjhHUD2lkphFmfB8RYr4RlB+5X9YRZqwf8V7uEdwfmVwGIWb0H0FaHiEML+lXNiFmxB+RVU4hPC9xUmYhZiQv4VB+IVQviU6WIWZsL2lLriGcL0lIxiFmtC+5Rt4hzC+ZQ/YhMoQvQQ4y/C8hPiYy/f4zkTw+MRQ/cTlWMUQ/M+E3bjFcP4k1hjF0PzMxM54xLD/ZMLYxpD8zSS/OMYw/KSzmMbw/M5kq/jHsP0EoFkHUPzPpJS5BHE+RI0ZBBE8zASJeQUxPqR92QWRPMxkejkF8TzEapkGUT5p3Ehk0TxEX1kFED/GZE+5BrE+ZEQZRxE9BmQ8eUfRPsQ02USRfWZkLTlE8X3EHZlEMXxmZBX5RbF+JA5ZRhF/BuQKuUZxfof+gx1AgxsVP/hDfUMxXrgb6KJz3UMxfmfiYD2D8X7Fz9LAnYBRvIfMgP2HmTU/wAFdgLG9x7nCcb2Bcb6ntqIdgRG+Jc+qIn2B0bzHoMLdgzqRvoeagz2CMb0nkOUjnYNRv8eHw/2Dsb+eZ35gXcLxvCd4InC9wHH/p2uhHcDR/kXPYkF9wTH9x1XB3cJkgRQZ2JtPgj3Bkf8Gb0MCncCBA7vW+Vs05oL9wlH8RzBDXcMR/5ynIKO9wBH+ZxpicB4Dcf3nDeB+A9H8hg8EgN4EtbzeE7vXu9TlzvThngDyP4brgf4DOJI+JuIiXgGyP+bY5+K+AhI+htKDHgLSP50mySN+AnI/xr/Cc94Dkj9Gs0A+QVI95c6p4J5Csf+mo6D+QzvyPyaXIV5BEn3GjOXBvkCyf4aHgh5DMj+fBnsCfkHSfMZ0wnLeQjJ9JmUjPkBSfuXOXuOeQXJ9hlWD/kM7snwmTCBegpJ95kTl4L6C8nyGPIEehFQ7nAYwAX6BMr6mJqJx3oByvGYgYj6Bkr8FzhcCnoJSvaYNov6DOrK9JgEjXoNSf8H3M7qF8r2B8BrHErwh6zB6xDL+wdzaxJL8gdsxOshUOyHNmsTSvcHHMfrFUvxhvlrFsv8BszK6xnL+gacax3K8QaMzesYS/8GT2seS/mGLMDsH8v0BgJsG0v+hdzD7BFM/IWlbB9K9wWMxuwUTPqFeGwXTPiFTMnsGMzzBStsEsz2hRzM7BpM9ITubBvM+4TMz+wVzPYEoW0QTfCEjMLtEc33hGRtE031hDzF7R1M8AQXbRZN+oPsyO0UzfUDym0XzfwDrMvtGs32g41tHE30g1zO7RlN8oMgbh3N/QL8we4QzveC024STvICtkTuJl3yhm4fTfcCZ+4WZU7xgkluGE74giruFmnO8wIMbh7M8QHd7hZmzvuBr24eTvYBgO8WbM79AWJvEU/7ATPvFmtO+QEFbxRP9wDW7xZlz/4AuG8XT/wAie8cj874b/oNQKzvG8/zAHzObx1P/YBP7x7P+AAlwWAQQPoPufLwAfHQ+sLw8Bs/VfXwAfrX6w+viB8EwPAOz/6/+Z63wHdrbqb6cAZA+gc+KfvwCsD4Dff9cAznwPQNk/7wDcD5DUOY8HEPQP4M/fHxDED+cwyy83EAwfgMZ/nE8QPB9gw19nEJQPsPO+r38QVB8Auv+5O/+5hB9QtU+vEJwf5xCvD8cQtB/QqM/s3xAkGJAa7xCqP7CTpa/3EMwfAMZS/B/gc53fJyAULzCZLz8gzmwf8JLvVyAsL9CPOc9vIFwvkImPhyB0L+fghN+fIIwvMIAvmLcgJBdGVQ99ziGkL1+ZeOYh1C8fcv4irX7fmWsWMewv
n2UuMQQ/f5liRjEcP89dXjFMPx+ZWXYyZa7fUo4xfD8vmU6mMZQ/70e+MTQ/P5lD1jGsPx9A7jN2bW+ZOwZB9D+/Nh5BDE/vmS42QSRPPypOQTxPj5klZkFUT28ifkHEPy+ZHJZBbE/vFa5BhE8/mRHGQZxP/wreQcxPv5kE9kHkHw8ADlG0T1/r+14nUB6tfh/1Hsw/UW1P8G5XUCxfL+o5Lm9QXF+f6J6HUERYzu8Uz+DOn1B0Xz/fOT63UERf/9j+z1CkX+dP1E7nUNRfn8+eHP9Q7F/vyu4XYLxX3mXnz8fOL2C8X6/ErpxHYDRvb75uX2BMb99zvN53YBxvD7UOj2DOBG9fsF6nYJRvH6o5Hr9hZa5Pok7XYIxf5y+fLu9grG/vmO6cB3DEb6+Srh9wDH9vc4xuN3Akf9+K3k9wzjx/n4SeZ3B8b15+zH5x9G+ueZZxhH/+dMyucZx/3nHGcbR/DmrM3nFUf85j9nHkf65gzA6BzH/+WyaBFI9OV8w+gfx/LlRWgSyPzkrMboFcj65HhoF0j45EzJ6BjI9OPraBRI8OOMzOgaSPXjPmgWx/ri7M/oG8j24oFpHUj94mzC6RHJ8OH0aRNJ9+HcxekQSfXhp2kUyfrhWYjpF8mO8U/hCmkXyfvpkKvpGUn34E1pGsn165Ae6R3J+u/K0HoB3WBZ/2bR+gHTWf7Q0cN6AMr37rfU+gPKjvFOc+5T1noDyvjuCNnH+gJK/e292XoISvvnPYva+gbK9+0n3HoHKcr15eMsyvHskd96HOrQ5uxG0PsLSvvr85vSew/K9+uX0/sBS/5z6zPVewRL+Oro2cb7Bcv76mvYewLL8Oc6INn7B0v16dXbewzoy/jpWNz7Ckv96QOd3nsNS/noqd/7Dsv0dehF0XwLy/X5J+HM8zbXlGwTTPLXNewUzPM31udsFkzz1ojsF8z5MdZabCNc9gvsEEz00xXdbBrMjvFA1X7sGszzPNUAbR9M89Tx7RxM8z/Ug20QzfTURO0TzfmZ1l0lTf7Tp+0dzPzZk3ltFs3x0zrtKtzy7MxtG0370p3tEk3w0lzPbRhN/NHg7hzN+tG8wm4eTfbRU+4RTvnQ3MVuFE7+0IbuFc760CXIbhdO9t/Gyf4BzN33P3vLfgpO8N8wzP4By86O8UPes85+C8724TztTvTeBMF/Ds753bOZwv8AT/XdVcR/Ac/9fN08xf8Bw1Do3NcYx38GT47xTdyNyP8M5M/y3ELKfwZP99vzl8v/CU/826zNfwrP/njbSM7/DE/22xbJwHAPT/LassHwAMD11zo1w3ADT/HZ0cTwHOrQ7dltxnAFQPvZM5vH8AbA+dkJyXAIQP512KXK8AnA89hzycxwA8D/2A/N8AzA8smXn2Adz/fHQOEbQPPBluJhEUH4xpPhEsF74lM0xjVhEsH3xbbhFcHzNcWIYRdB+sU54RjB8z3Eu2EaQfLEfOEbwfk+xA5hHUHzz+Enz/jJk3FiHsH2w0LiEcL0yZMUYhNC98KV4hTC9cYSZ2IWQvviPkD/wcpiE2fC9MGL4hHEB+VcYWmRXWIZQvzAvuIawvrJkIBjHcL2wCHjH0Lyxr/Cs3MBtlL/RbTzA4PD9OP4BT/Or7fzBsP+dM5kuXMIQ/LOMrnK8wnD/s3OvHMLQ/PHPYO98wzD+M04v3MM7kP0zNSw9A/D+cyHWbJ0AbxS58xXs/QJwUSM+UrL2rV0BET/xzuPtvQFxPLLErh0CdG1V+DK4Ln0B0T8xzp8u3QBRPjKGLz0DOpE9snmvnQIxPvJk5u/9AvE8MlQsXUARf51yQWy9QHF88jTuYR1AsSJ8lhvtfUDRfvDOAu3dQZF8LfI5R1E8zW3emUXxfG3G+Uu1eM2ts1lHEXytm7lHcXzN7YQZh9F87Wx5hDG8zi1Y2YSRv21FOYSxPM5tLZmE8b8tDfmFsbzM7QpZhhG/7O65hnG8zSzfGYbRvKzTeYZRfM+st9mHMb1ss
DnHkbzOLJCZx/G/bHz5xFH8zKxtWcexP6xRucSx/MzsQhnF0f4sLnnGMf3MrArZxTF8L/wrPcM0adX/3Oudw1H+r9Tmq/3Dsf/vw+heABI/HK+kqL4AciMd15gqcR4AcjzveOl+ATI8bc9sad4HFnrvRuo+AmGSPjo7Hdc0Kv4Csj8tzxsrXgMSPG8Ia74HmxV+9ageQ3I+7uLqcH5AMn3uyejeQ9I8766w6T5AaTV7botqcZ5Akn7ufun+QPJ9bc5Zal5CEn6uRqq+Qzpyf+4z6x5Bsn0uIOUrfkLSfm4Oa95DknzPqfg6h/J/KeyahFK8z+nM+oSyvSm9WocyfM5pqbqFErwpphqFcr6M6YZ6hdK+vAlykr7qZVM6hjK9aS+ahvK8aqUX+oi2vQRaxGu2+upk7LrEEv3o1RrHUrzqZL16xVJ9qJ3axZL+6mSKOsTS/Ch6msUy/qpkUvrGUvxoT1rIlntqZDO6xfL+6CQbB3L96uQMewcS/Ov05N8AZxqWv+IlPwDzIL5RK8jlJZ8BUzyrvKX/APM/Xeup5l8AZxS7K5TnJr8CEz4rficfAtM/Mxs/IpNe598Dkz5rUOZkP0PzPWs5ZJ9AU3+eKxok/0Czf2sHZ5FfQbM8KBsJE31q1WZyH0FzfGq8Zn9FVz6c5SbfQdN8qpCnP0Lzf5+qd6efQ1N+ql6mc/9Ck3xqWGRfhTV5Kc45JL+Ds3yqLKUfgrjTvCogJX+AZ3f7KM4HJd+BM7xl9juIlnjNpeKbhfO9Jdb7hrO8zCW/W4cTv6Wvu4dzvMzlnBvH074liHvGU7zPZXTbxHO+5Wk7xDP8z6VJm8VT/OU5+8Wz/M2lGlvGE/7lBrvGc/zPpOcbxJP+pM97xnM8ziTD28cz/KScOATz/M+kgJgG0/zkcPgH8/zPZElYB5P8pDm4BRA8zWQaGASwPOQOeAYwP12n7aLcAGCWeSfg5SM8AvA8J8gjnANQIOM/l5wROXVntWBcQ7A/nqeioLxDUD/nj+JxHEBwfKdwoXxA0Hwnj2Qh3EAQYH0RZ1FicjxBkH6nPqKcQBB/5c8r4vxBMH0nGSNcQzpQfmcGY7xDEH8m5OcgHIKwf+bH4HyAML+fZrtg3INwfuau4nE8gPC95pXhnIPQfWXOiWH8gbC+JmoiXIM5UL0mUSK8gnC8JjjkIxyC0L1mJWN8gzC9nGYMY9yElnrh5DjFmJC94cyYx/C/Ibj4xZiw/GGpWMhVOSGJuMWZEPyhfhjF0P3hanjFmXD/IVbYxjD8YUc4yZiWeSEnmMdQ/CEP+MWbsP8g8FkEUP4g2LkFmHE9IMEZBBE8IKl5BZkxPWCV2QaQ/iB2OQWY0T0gXpkF8T5gSvkFmrE84CdZBZE8YBu5CrjUe2P/XB1AXlU/+NUcfUBeNP/Z3N1AkX8dY81dPUDxY75So7jmnZ1A8X4jrh39QDFjmZzXlR5dQVF943Xecr1CcXzjXN8dQtF/4c9D331DMX7jKt/dQzgxfCMYHD2BsX+jCOecnYBRvGLsXP2FtX3O5h1dg5F+4sbdvYM5cb5iul4dg/F94qzl3n2BEb+ip57dgpG/nOKU3z2C8b2idZ5znYIxvuJi3/2B0bwhzlAcXcNRvyI3HL3DOHH84jDdHcDR/aISZZ19wTH8nfnZxZH9XmXaOcXx/N3OmcZR/15lpvnEEf7dm1nGEX3eZYO5x7G83WgaBrH+HmVUegcR/R082gfR/B5lJToEMjzdBZoFUj6fMfYI8j0c2loFsj7c0zK6BnI/nLMaBJI+nJszegbSPZyD2geSPJxrMDpH8j1cSJpEUnxcMZD6SrX8FVpHcfycBbpGuRJ9X+VaHkBZtj/Rxpp+QjJjvlPfv9reQzoyf1+zWz5CknyfouSbnkLyfl+aW/5AWzg0uV+BWF6Dsn6fbOaYvoByv99b2R6DUn+e30LZfoASvd8p2nHegTK9Xx1aPoDSvN3PENqegfK9nvGa/oM6Ur9e61teg
ZK8ntjkm76Csr3exdgew3K+Hx6zGH7DErx61rh2Hc6aGT7Akv/ek9mewzlS/J50mf7A8v5ebOZaXsGy/x5PGr7Ccv2cXjxbHsVW/i/bfsM7Mv7eFtvewhL92f8wOweS/xnomwfy/hnTMPsEUz7ZsVsEsz5ZpzG7BRM/GYYbBXM+mXsyewXTPZli2wYzP1lbMzsGkz7ZT5sG8z+ZLKP7B1M/XEkgE32Yt0hzfZC7f+MQyXtFM33Y0dtGGZN8WK47Sfd+P32EiRpkjvtHsz7Yh1tGs35YxHu7RxNh/ZDYVBuH0rzMWEh7iDS5GCjbhJO9zJgdO4Qzvxv3FZ+CdFQ0uFvkVf+E1HgbD6wWX4Gzvlu+YxOGlHMfgtO/23PXf4ITof2TnltOV9+Dk73bQdZwP8PzvNso1J/AU/2ZzwmU/8MzvJrwlV/DORP9WtFVv8ITvNrE5NYfwdP+2pLWf8Iz/nJ71rh02mDXP8Lz/pnOWpefwpP9GjUX/8M7U/5aIlRcABA9WgrlVLwAcBxrIGT8AqWc/pX1GAez/9XheAcYcDyVxdgFkCH9ktXLMjgFkD1VppgGUDzVmzL4BTA8VY9YCNR4FbszuAawP5WoGEXwPhWHMHhH0D0VbNhEkH3VTzE4RDB+lS2YRPB9lRcx+EWwf1UOWEYQftUAMrhHcD5U9xhG0H8YfYCMzxTX2EZwfNTQOIfwfQ0UpJiEULyYvABQfViFm5B9VHm4hXC+FFoYhZkQv1RGeIYwvBQq2IWakL+UGziF0LzUC5iGuLP/V+NT/IBR9D/UxtBcwBDj/Be3kLzAcP+c16TRHMDQ/9eL0Ol8wFCVetdy0dzAEP+fV39SPMGQ/5dTkcKcwlD+mNc7tFc0U1zGYxT/XNGY+w7QHQPQ/5VO75B9A3D+loPG1JE/nRaxET0A8TyWpJJxnQAxPVaFUf0BUTzVznjSXQGxPFZsUr0HGTQ+X9MdAtEh/ZEWTOUTfQIRPBY0E90DkT+dViFQPUPxPNYU0nCdQfD+FgIQ/UBRftGF4VlFEX1ZVdm00bIZRZlxf9GWeUbRPlFy2UWaMX8RUzlF0XxRQ5lJmZd5ESP5R7F8kRRZhZgRvVD0uYbxfxDtGYWY0bzQ6XmFMb4Q1dmFmZG9kMo5hfG+UKqZhZpRv5CW+YaxvFB7WYmZdniQT7mEcb1QLBnHm9G+kBh5xxG9E/UM6N3ATZd4E9wNPcDx/16Tto2dwE8Vv7BOcf3HN7tTl05dwVH9Ew+RDr3CEf65/cKN03Dlz33Ccf8TXw/dwzHjO7+TUzNMPgOR/RMs5QyeAFI9UwFM/gCyPzD6FZm64g2+AXI/UsznTh4B0j7Sws5+ARI/nBKwDt4Ckj5StkxzPgLyPhJ+D54CMj+aP5rBUlJMXkASfVI5TnC+Q1I/0hPNHkDSfs2F+XpFMn16dPgUTb46SZs3uM3KmkWSfY2q+kWaUn7Nl1pHEn+Nd7pFm3J8TVgaize6jVx6hZgyvQ042oSSvI0tOoWasn8NBZqE8rzNAfqFmbK9DNZahhK+TMK6hZpyv4yvGoVSvMyfeoQC0pwA</item>

... snipped

        <item name="Gain6">QkFSWgAACOQCAEjgBu3z8D/u/wAPEg8kDzYPAEgPWg9sD34PkA+iD7QPxg8A2A/qD/wPDh8gHzIfRB9WHwBoH3ofjB+eH7Afwh/UH+YfAPgfCi8cLy4vQC9SL2Qvdi8AiC+aL6wvvi/QL+Iv9C8GPwAYPyo/PD9OP2A/cj+EP5Y/AKg/uj/MP94/8D8CTxRPJk8AOE9KT1xPbk+AT5JPpE+2TwDIT9pP7E/+TxBfIl80X0ZfAFhfal98X45foF+yX8Rf1l8A6F/6XwxvHm8wb0JvVG9mbwB4b4pvnG+ub8Bv0m/kb/ZvAAh/Gn8sfz5/UH9if3R/hn8AmH+qf7x/zn/gf/J/BI8WjwAojzqPTI9ej3CPgo+Uj6aPALiPyo/cj+6PAJ8SnySfNp8ASJ9an2yffp+Qn6KftJ/GnwDYn+qf/J8OryCvMq9Er1avAGiveq+Mr56vsK/Cr9Sv5q8A+K8Kvxy/Lr9Av1K/ZL92vwCIv5q/rL++v9C/4r/0vwbPABjPKs88z07PYM9yz4TPls8AqM+6z8zP3s/wzwLfFN8m3wA430rfXN9u34Dfkt+k37bfAMjf2t/s3/7fEO8i7zTvRu8AWO9q73zvju+g77LvxO/W7wDo7/rvDP8e/zD/Qv9U/2b/AHj/iv+c/67/wP/S/+T/9v8ACA8aDywPPg9QD2IPdA+GDwCYD6oPvA/OD+AP8g8EHxYfACgfOh9MH14fcB+CH5Qfph8AuB/KH9wf7h8ALxIvJC82LwBIL1ovbC9+L5Avoi+0L8YvANgv6i/8Lw4/ID8yP0Q/Vj8AaD96P4w/nj+wP8I/1D/mPwD4PwpPHE8uT0BPUk9kT3ZPAIhPmk+sT75P0E/iT/RPBl8AGF8qXzxfTl9gX3JfhF+WXwCoX7pfzF/eX/BfAm8UbyZvADhvSm9cb25vgG+Sb6Rvtm8AyG/ab+xv/m8QfyJ/NH9GfwBYf2p/fH+Of6B/sn/Ef9Z/AOh/+n8Mjx6PMI9Cj1SPZo8AeI+Kj5yPro/Aj9KP5I/2jwAInxqfLJ8+n1CfYp90n4afAJifqp+8n86f4J/ynwSvFq8AKK86r0yvXq9wr4KvlK+mrwC4r8qv3K/urwC/Er8kvza/AEi/Wr9sv36/kL+iv7S/xr8A2L/qv/y/Ds8gzzLPRM9WzwBoz3rPjM+ez7DPws/Uz+bPAPjPCt8c3y7fQN9S32Tfdt8AiN+a36zfvt/Q3+Lf9N8G7wAY7yrvPO9O72Dvcu+E75bvAKjvuu/M797v8O8C/xT/Jv8AOP9K/1z/bv+A/5L/pP+2/wDI/9r/7P/+/xAPIg80D0YPAFgPag98D44PoA+yD8QP1g8A6A/6DwwfHh8wH0IfVB9mHwB4H4ofnB+uH8Af0h/kH/YfAAgvGi8sLz4vUC9iL3Qvhi8AmC+qL7wvzi/gL/IvBD8WPwAoPzo/TD9eP3A/gj+UP6Y/ALg/yj/cP+4/AE8STyRPNk8ASE9aT2xPfk+QT6JPtE/GTwDYT+pP/E8OXyBfMl9EX1ZfAGhfel+MX55fsF/CX9Rf5l8A+F8KbxxvLm9Ab1JvZG92bwCIb5pvrG++b9Bv4m/0bwZ/ABh/Kn88f05/YH9yf4R/ln8AqH+6f8x/3n/wfwKPFI8mjwA4j0qPXI9uj4CPko+kj7aPAMiP2o/sj/6PEJ8inzSfRp8AWJ9qn3yfjp+gn7KfxJ/WnwDon/qfDK8erzCvQq9Ur2avAHiviq+cr66vwK/Sr+Sv9q8ACL8avyy/Pr9Qv2K/dL+GvwCYv6q/vL/Ov+C/8r8EzxbPACjPOs9Mz17PcM+Cz5TPps8AuM/Kz9zP7s8A3xLfJN823wBI31rfbN9+35Dfot+038bfANjf6t/83w7vIO8y70TvVu8AaO9674zvnu+w78Lv1O/m7wD47wr/HP8u/0D/Uv9k/3b/AIj/mv+s/77/0P/i//T/Bg8AGA8qDzwPT
g9gD3IPhA+WDwCoD7oPzA/eD/APAh8UHyYfADgfSh9cH24fgB+SH6Qfth8AyB/aH+wf/h8QLyIvNC9GLwBYL2ovfC+OL6Avsi/EL9YvAOgv+i8MPx4/MD9CP1Q/Zj8AeD+KP5w/rj/AP9I/5D/2PwAITxpPLE8+T1BPYk90T4ZPAJhPqk+8T85P4E/yTwRfFl8AKF86X0xfXl9wX4JflF+mXwC4X8pf3F/uXwBvEm8kbzZvAEhvWm9sb35vkG+ib7Rvxm8A2G/qb/xvDn8gfzJ/RH9WfwBof3p/jH+ef7B/wn/Uf+Z/APh/Co8cjy6PQI9Sj2SPdo8AiI+aj6yPvo/Qj+KP9I8GnwAYnyqfPJ9On2Cfcp+En5afAKifup/Mn96f8J8CrxSvJq8AOK9Kr1yvbq+Ar5KvpK+2rwDIr9qv7K/+rxC/Ir80v0a/AFi/ar98v46/oL+yv8S/1r8A6L/6vwzPHs8wz0LPVM9mzwB4z4rPnM+uz8DP0s/kz/bPAAjfGt8s3z7fUN9i33Tfht8AmN+q37zfzt/g3/LfBO8W7wAo7zrvTO9e73Dvgu+U76bvALjvyu/c7+7vAP8S/yT/Nv8ASP9a/2z/fv+Q/6L/tP/G/wDY/+r//P8ODyAPMg9ED1YPAGgPeg+MD54PsA/CD9QP5g8A+A8KHxwfLh9AH1IfZB92HwCIH5ofrB++H9Af4h/0HwYvABgvKi88L04vYC9yL4Qvli8AqC+6L8wv3i/wLwI/FD8mPwA4P0o/XD9uP4A/kj+kP7Y/AMg/2j/sP/4/EE8iTzRPRk8AWE9qT3xPjk+gT7JPxE/WTwDoT/pPDF8eXzBfQl9UX2ZfAHhfil+cX65fwF/SX+Rf9l8ACG8abyxvPm9Qb2JvdG+GbwCYb6pvvG/Ob+Bv8m8EfxZ/ACh/On9Mf15/cH+Cf5R/pn8AuH/Kf9x/7n8AjxKPJI82jwBIj1qPbI9+j5CPoo+0j8aPANiP6o/8jw6fIJ8yn0SfVp8AaJ96n4yfnp+wn8Kf1J/mnwD4nwqvHK8ur0CvUq9kr3avAIivmq+sr76v0K/ioQ</item>
    </list>
</obj>
Asked by Alex, Oct 20 '16 at 17:10


1 Answer

This is what I have so far.
The document you're using is not from an actual PCR run, as inferred from the readable data. It is a color compensation run (short overview that seems to match the file) (full updated manual, page 250, not as fitting). Specifically, it seems to be a color compensation run for the "FAM/Pulsar 650" dye.
The output type, as you point out, is this "AcquisitionTable" with 2400 "counts", which I believe differs from the output you would normally get from a PCR run. I'm sure you've found these already, but a few public examples of PCR templates (not completed runs) are here, here, here and here.

According to the LCRunProgram in your file, the protocol here was:
hold 95°C for 0" at a speed of 20°C/s
hold 40°C for 30", 20°C/s
hold 95°C for 0" at 0.1°C/s, acquisition mode "2".

So the acquisition window should have lasted approximately (95°C - 40°C) / (0.1°C/s) = 550 seconds, during which there should have been a fixed number of acquisition events per second.

EDIT 0 - this is what I had done at the beginning, so I'm not deleting it, but I got more interesting information later (see below).

I took a look at the data with a simple Python script (I'm a Python guy) to search for patterns. The script holds your data's initial strings in a dictionary called values, which would be too long to post here, so I put it in a gist, just as you did with your file.

#!/usr/bin/env python3

import base64
from collections import defaultdict

from values import values  # dict: column name -> base64 string (see the gist)

def splitme(name, sep):
    # The strings are unpadded; '==' supplies the padding that the file omits.
    decoded = base64.b64decode(values[name]+'==')
    splitted = decoded.split(sep)
    # Report: name [base64 length; decoded length], chunk count, chunk sizes.
    print("{:<12} [{}; {}] separated in {} chunks: {}".format(
            name,
            len(values[name]), len(decoded),
            len(splitted),
            [len(i) for i in splitted]))
    return splitted

if __name__ == '__main__':
    # Map each chunk to every (column, position) where it occurs.
    allchunks = defaultdict(list)
    separator = b'\r'
    print("separating by:", separator)
    for key in values:
        data = splitme(key, sep=separator)
        for i, item in enumerate(data):
            allchunks[item].append((key, i))
    # Chunks that occur more than once, possibly in different columns.
    print("Common chunks:")
    for location in [value for item, value in allchunks.items() if len(value)>1]:
        print(location)

Let's get the obvious out of the way: ProgramNo and CycleNo hold the same data, and all the Gain columns are identical. So I'll only post one of each.

Trying the script with the separator b'\r' (picked arbitrarily) cuts a few of them into chunks of 272 bytes (271 + separator). The others don't split tidily.

separating by: b'\r'
SampleNo     [1536; 1152] separated in 5 chunks: [174, 271, 271, 271, 161]
ProgramNo    [1531; 1148] separated in 6 chunks: [47, 271, 271, 271, 271, 12]
SegmentNo    [1531; 1148] separated in 5 chunks: [169, 271, 271, 271, 162]

Separating by b'\t' gives similar results:

separating by: b'\t'
SampleNo     [1536; 1152] separated in 5 chunks: [204, 271, 271, 271, 131]
ProgramNo    [1531; 1148] separated in 5 chunks: [76, 271, 271, 271, 255]
SegmentNo    [1531; 1148] separated in 5 chunks: [199, 271, 271, 271, 132]

And separating by b'\n' splits the gains this time, in a similar way:

separating by: b'\n'
Gain1        [3046; 2284] separated in 10 chunks: [81, 271, 271, 271, 271, 271, 271, 271, 271, 26]  

I am not implying that these "separators" matter in themselves; I think they are simply rare byte values that happen to cut the data into 272-byte chunks, and this value, 272 bytes, might be important for understanding how the data is stored.

The "BARZ" at the beginning of each string looks like a "foo-bar"-style marker, probably placed at the start of the header as a check.

Another interesting thing is that the gain data separates into 8 equal-sized chunks (plus two other, smaller blocks). If this data is from a 96-well plate, I would start by exploring whether this is a header followed by 8 chunks (rows), each splittable into 12 items (columns), so that 8*12=96 mirrors the layout of a 96-well plate.

Also, if this "272 bytes per line" hypothesis is true, then the fact that ProgramNo, SampleNo etc. split into only a few 272-byte chunks might be explained if the plate wasn't full: some wells had samples (giving a few complete lines) while others were empty. I'm not sure if this would make sense for a color compensation plate.

Time, Temperature, Error and Fluors do not separate into chunks and you are correct in thinking they are a set of continuous values; not necessarily floats though. Fluorescence can be captured as "units" which might be positive ints (I don't have a LightCycler so I don't know if it's the case or not).

And this is where I am so far. I'm not sure I'll have time to go further. In case I don't reply back, good luck with your endeavour.

EDIT 1:

So regarding the SampleNo data, it seems to be structured in this way:
1) a header, which might or might not be separated by 0x00 like:
* the BARZ header, then 2 times 0x00 (total 6 bytes)
* three bytes, then 0x00 (total 4 bytes)
* 17 bytes, then 0x00 (total 18 bytes)
2) a series of data blocks, each consisting of 16 bytes and terminated by 0x00 (so 17 bytes each).
This means that SampleNo holds a header, plus 66 sets of 17 bytes.
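The layout described above can be checked by walking the decoded bytes with a fixed stride instead of splitting on 0x00, which avoids being misled by a zero that happens to be a data byte. The sketch below is only an illustration: it uses the first 108 base64 characters of the SampleNo item, and `HEADER_LEN` (6 + 4 + 18 = 28 bytes) and `RECORD_LEN` are assumptions taken from the list above, not from any specification.

```python
import base64

# First 108 base64 characters of the SampleNo item (81 decoded bytes).
prefix_b64 = (
    "QkFSWgAABHgCAER0Cu3xAe3wAuv//f8PDyEPADMPRQ9XD2kPew+ND58PsQ8A"
    "ww/VD+cP+Q8LHx0fLx9BHwBTH2Ufdx+JH5sfrR+/H9EfAOMf"
)
data = base64.b64decode(prefix_b64)

HEADER_LEN = 28   # assumption: 6 + 4 + 18 header bytes, as listed above
RECORD_LEN = 17   # assumption: 16 payload bytes + one 0x00 terminator

records = []
for start in range(HEADER_LEN, len(data) - RECORD_LEN + 1, RECORD_LEN):
    record = data[start:start + RECORD_LEN]
    payload, terminator = record[:16], record[16]
    # Keep the payload and whether the 17th byte really is 0x00.
    records.append((payload, terminator == 0))

for payload, ok in records:
    print(payload.hex(" "), "terminator ok" if ok else "unexpected terminator")
```

Only complete 17-byte records are reported; the few bytes left at the end of this short prefix are ignored.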

EDIT 2:

Splitting everything by 0x00 with this awful piece of code:

def splitme(name):
    # Replaces splitme() in the first script; base64 and values come from there.
    data = base64.b64decode(values[name]+'==')
    # Split on 0x00. Bytes after the last 0x00 are not reported as a chunk,
    # so chunk lengths plus separators can fall short of the decoded length.
    splitted = data.split(b'\x00')
    if len(splitted) > 1:
        splitted = splitted[:-1]
    print("{:<12} [{}; {}] separated in {} chunks: {}".format(
            name,
            len(values[name]), len(data),
            len(splitted),
            [len(i) for i in splitted]))
    return splitted

Yields:

separating by: 0x0
SampleNo     [1536; 1152] separated in 70 chunks: [4, 0, 3, 17, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16]
ProgramNo    [1531; 1148] separated in 71 chunks: [4, 0, 3, 2, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 12]
SegmentNo    [1531; 1148] separated in 69 chunks: [4, 0, 3, 18, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16]
CycleNo      [1531; 1148] separated in 71 chunks: [4, 0, 3, 2, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 12]
Time         [11944; 8958] separated in 63 chunks: [4, 0, 3, 45, 14, 42, 76, 46, 172, 110, 109, 15, 81, 90, 111, 108, 78, 46, 175, 141, 88, 209, 74, 117, 156, 170, 59, 107, 78, 103, 125, 171, 103, 170, 191, 333, 154, 187, 11, 257, 149, 208, 173, 156, 153, 412, 72, 55, 207, 131, 131, 274, 284, 238, 19, 241, 247, 13, 74, 558, 763, 8, 0]
Temperature  [6731; 5048] separated in 14 chunks: [4, 0, 3, 394, 186, 543, 177, 173, 530, 534, 371, 714, 373, 1032]
Error        [398; 298] separated in 21 chunks: [4, 0, 3, 2, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 12]
Fluor1       [7539; 5654] separated in 38 chunks: [4, 0, 3, 31, 13, 7, 7, 426, 331, 218, 187, 11, 10, 13, 7, 6, 7, 48, 45, 217, 840, 6, 7, 14, 7, 6, 7, 7, 6, 1178, 8, 6, 1147, 7, 6, 141, 630, 2]
...
Gain1        [3046; 2284] separated in 145 chunks: [4, 0, 3, 9, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16]
...

So SampleNo, ProgramNo, SegmentNo, Error and the Gains all split into blocks of 17 bytes (16 bytes + 0x00).

EDIT 3:
The first fifteen 17-byte chunks of ProgramNo (and its copy CycleNo) and Error are identical.
Just to clarify, the "chunks" I describe are what you describe as a series of number pairs, one of which increases by 0x12. The 0x00 that you mention is the separator between the chunks.

EDIT 4:
About the Gain data: the link between my initial "272-byte" blocks and the (16 + 0x00)-byte blocks is that there is a repeating pattern of 16 blocks. 15 of them are "16 + 0x00" blocks, and the last one has a 0x00 in the middle. So 17 bytes (= 16 + 0x00) * 16 blocks = 272 bytes in total for each repeat.
The whole string is built as follows: the "header" part, then 8 such repeats of 17 bytes * 16 blocks, and then four 17-byte blocks at the end. So on the one hand I was right about the 8 blocks, but apparently I was wrong to draw a parallel with an 8x12-well PCR plate. Here it's more like 8*16 (+4).
About the Fluor etc. data, I don't have an answer, but I'd try stripping the header and seeing whether any (integer or float) compression algorithm works on it. Compressed data would explain why these fields have different lengths.
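One cheap way to follow up on the compression idea is to try the standard-library decompressors at a range of start offsets. This is only a sketch under stated assumptions: `probe_decompressors` is a hypothetical helper, and the blob used to exercise it is synthetic (a made-up 6-byte prefix followed by zlib-compressed bytes), not data from the .ixo file. The real fields may well use a custom scheme that none of these recognise.

```python
import bz2
import lzma
import zlib

def probe_decompressors(blob, max_offset=32):
    """Try zlib, bz2 and lzma at each offset below max_offset.

    Returns a list of (offset, codec name, decompressed length) for every
    attempt that succeeds and yields at least one byte.
    """
    codecs = {
        "zlib": zlib.decompress,
        "bz2": bz2.decompress,
        "lzma": lzma.decompress,
    }
    hits = []
    for offset in range(min(max_offset, len(blob))):
        for name, decompress in codecs.items():
            try:
                out = decompress(blob[offset:])
            except Exception:
                # Each codec raises its own error type on invalid input.
                continue
            if out:
                hits.append((offset, name, len(out)))
    return hits

# Synthetic example: a made-up 6-byte prefix in front of zlib-compressed data.
payload = bytes(range(256)) * 4
blob = b"BARZ\x00\x00" + zlib.compress(payload)
print(probe_decompressors(blob))
```

A hit on real data would still need checking by hand, since a short input can decompress "successfully" by coincidence.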

Answered by Roberto, Oct 01 '22 at 23:10