
binary format to pass tabular data

I'm maintaining a legacy embedded device which interacts with the real world. Generally speaking, this device collects data from sensors, processes the data using its internal algorithm, and displays a warning when the data reaches a certain "bad" state.

For debugging purposes, we want the device to send us, on a regular basis, much of the data it receives as well as the data after it has been processed.

We reached the conclusion that most of the data can be described in tabular form, something along the lines of

sensor|time|temperature|moisture
------+----+-----------+--------
1     |3012|20         |0.5
2     |3024|22         |0.9

We obviously need to support more than one form of table.

So basically we need a protocol that can accept a set of table descriptions and then deliver table data according to those descriptions.

Example pseudocode for sending data:

table_t table = select_table(SENSORS_TABLE);
/* one initializer per row: {sensor, time, temperature, moisture} */
sensors_table_data_t data[] = {
    {1, 3012, 20, 0.5},
    {2, 3024, 22, 0.9}
};
send_data(table, data);

Example pseudocode for receiving data:

data_t *data = receive();
switch (data->table) {
    case SENSORS_TABLE:
         puts("sensor|time|temperature|moisture");
         /* numeric conversions, assuming integer sensor/time/temperature
            and a floating-point moisture value */
         for (int i = 0; i < data->length; i++) printf(
             "%6d|%4d|%11d|%8.1f\n",
              data->cell[i]->sensor,
              data->cell[i]->time,
              data->cell[i]->temperature,
              data->cell[i]->moisture);
         break;
    case USER_INPUT_TABLE:
         ...
}

The tables could be defined either offline, on both the device and the client computer communicating with it, or online. We can add a simple handshake protocol to agree upon the table formats at the device's boot time.

Since this is a legacy device, it supports only RS232 communication, and since its CPU is pretty slow (roughly equivalent to a 486), we cannot afford any XML-like data transfer methods. Those are too expensive, either in computation time or in bandwidth. Sending raw SQL commands was also considered and rejected due to bandwidth considerations.

[edit]

For clarification: to reduce overhead, I'm trying to avoid sending the table header each time I send data. That way, each time I send a table row, I only have to send the table's id.

I would also like to note that most of the data I wish to pass is numerical, so text-based protocols are too wasteful.

Lastly, I've looked at Google's protocol buffers. They come close enough, but they don't support C.

[/edit]
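
To put a rough number on the text-versus-binary difference, the sketch below compares the size of one row of the example table sent as text and as fixed-width binary. The binary field widths (1-byte sensor id, 4-byte time, 2-byte temperature, 2-byte moisture scaled by 100) and the function name text_row_bytes are assumptions made for illustration, not part of the actual device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed wire widths (hypothetical): sensor u8, time u32, temperature i16,
   moisture u16 holding the value scaled by 100. */
enum { BINARY_ROW_BYTES = 1 + 4 + 2 + 2 };

/* Length in bytes of the same row rendered as '|'-separated text plus '\n'. */
size_t text_row_bytes(unsigned sensor, unsigned long time_ticks,
                      int temperature, double moisture)
{
    char buf[64];
    int n = snprintf(buf, sizeof buf, "%u|%lu|%d|%.1f\n",
                     sensor, time_ticks, temperature, moisture);
    /* the buffer is sized generously; a truncated row would be a bug here */
    assert(n >= 0 && (size_t)n < sizeof buf);
    return (size_t)n;
}
```

For the first row of the example table (1, 3012, 20, 0.5) the text form is 14 bytes against 9 bytes of binary under these assumptions. The text size grows with the magnitude of the values, while the binary size stays fixed.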

Any idea about a known protocol or implementation like what I described? Any better idea for sending this data?

I'm aware that this protocol is not very hard to design. I had in mind a two-phase protocol:

1) Handshake: send the headers of all tables you wish to fill. Each table description would include information about the size of each column.

2) Data: send the table index (according to handshake) followed by the actual data. Data will be followed by a checksum.
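
A minimal sketch of the data phase in step 2, assuming a frame made of one table-index byte, one payload-length byte, the packed row bytes, and a one-byte additive checksum. The layout, the names checksum8, encode_frame and frame_is_valid, and the choice of checksum are all assumptions for illustration; a CRC would catch more errors at a higher CPU cost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame layout: [table index][payload length][payload...][checksum] */
enum { FRAME_OVERHEAD = 3 };

/* Sum of all bytes modulo 256. */
uint8_t checksum8(const uint8_t *p, size_t n)
{
    uint8_t sum = 0;
    while (n--) sum = (uint8_t)(sum + *p++);
    return sum;
}

/* Builds a frame in out[]; returns the frame size, or 0 if it does not fit. */
size_t encode_frame(uint8_t table, const uint8_t *payload, size_t len,
                    uint8_t *out, size_t cap)
{
    assert(payload != NULL && out != NULL);
    if (len > 255 || cap < len + FRAME_OVERHEAD) return 0;
    out[0] = table;
    out[1] = (uint8_t)len;
    memcpy(out + 2, payload, len);
    out[2 + len] = checksum8(out, 2 + len);  /* covers index, length, payload */
    return len + FRAME_OVERHEAD;
}

/* Returns 1 if buf holds exactly one frame whose checksum matches, else 0. */
int frame_is_valid(const uint8_t *buf, size_t n)
{
    if (n < FRAME_OVERHEAD) return 0;
    size_t len = buf[1];
    if (n != len + FRAME_OVERHEAD) return 0;
    return checksum8(buf, 2 + len) == buf[2 + len];
}
```

The receiver looks up the table index from the handshake to learn how to split the payload into columns, so no header travels with the row.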

But I wish to avoid the small details of such a design and use some ready-made protocol or, even better, an available implementation.

Elazar Leibovich asked Dec 31 '22 01:12



2 Answers

I am not aware of any protocol which does this (there might be one, but I don't know it.)

I'm sure you've thought of this: why not pass the format as a binary data stream as well?

pseudocode:

struct table_format_header {
  int number_of_fields; /* number of fields that will be defined in table */
                        /* sent before the field descriptions themselves  */
};

struct table_format {
   char column_name[8];   /* name of column ("sensor")   */
   char fmt_specifier[5]; /* format specifier for column */

   ... (etc)
};

Then you can compute the fields/columns (somehow), transmit the header struct so that the recipient can allocate buffers, and then iteratively transmit a table_format struct for each of those fields. Each struct would carry all the information you need about its column: name, number of bytes in the field, whatever. If space is really constrained, you can use bit-fields (int precision:3) to specify the different attributes.
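
One way to make this concrete, as a sketch only: the descriptor below carries a column name and a byte width, and the receiver sums the widths to learn how many bytes each row occupies and where each column starts. The width member and the names column_desc, row_size and column_offset are assumptions; the struct members elided above are not filled in here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-column descriptor, sent once during the handshake. */
struct column_desc {
    char    name[8];   /* column name, NUL-padded, e.g. "sensor" */
    uint8_t width;     /* assumed member: bytes this column occupies per row */
};

/* Total bytes per row = sum of the widths of the first ncols columns. */
size_t row_size(const struct column_desc *cols, size_t ncols)
{
    assert(cols != NULL || ncols == 0);
    size_t total = 0;
    for (size_t i = 0; i < ncols; i++)
        total += cols[i].width;
    return total;
}

/* Byte offset of column idx within a row: the widths of all columns before it.
   The caller must ensure idx is a valid column index. */
size_t column_offset(const struct column_desc *cols, size_t idx)
{
    return row_size(cols, idx);
}
```

With the row size known from the handshake, the receiver can allocate one fixed buffer per table and read rows without any per-row framing beyond the table id.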

poundifdef answered Jan 01 '23 15:01



You may want to try protocol buffers.

http://code.google.com/p/protobuf/

Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats.

Building off rascher's comment, protobufs compile the format, so it's ridiculously efficient to transmit and receive. It's also extensible in case you want to add or remove fields later. And there are great APIs (e.g. protobuf python).
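
Part of why the encoding is compact is that Protocol Buffers stores integers as base-128 varints, so small values take a single byte. The sketch below hand-rolls that one encoding in C to show the idea. It is not the protobuf library API, and the function names varint_encode and varint_decode are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes v as a base-128 varint: 7 data bits per byte, least significant
   group first, high bit set on every byte except the last.
   out must have room for up to 5 bytes. Returns the number of bytes written. */
size_t varint_encode(uint32_t v, uint8_t *out)
{
    assert(out != NULL);
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)((v & 0x7F) | 0x80);
        v >>= 7;
    }
    out[n++] = (uint8_t)v;
    return n;
}

/* Reads a varint from in[0..len). Returns the number of bytes consumed,
   or 0 if the input ends before the final byte or runs past 5 bytes. */
size_t varint_decode(const uint8_t *in, size_t len, uint32_t *v)
{
    uint32_t result = 0;
    for (size_t i = 0; i < len && i < 5; i++) {
        result |= (uint32_t)(in[i] & 0x7F) << (7 * i);
        if (!(in[i] & 0x80)) {
            *v = result;
            return i + 1;
        }
    }
    return 0;
}
```

With this encoding, the time value 3012 from the example table takes two bytes and the temperature 20 takes one.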

ramanujan answered Jan 01 '23 13:01
