
How can I pack numpy bool arrays into a string of bits?

Overview

In numpy, I have an array of bools. The array was retrieved from an image; it is two-dimensional and contains 1024 columns and 768 rows. I want to push this data over an Ethernet cable. There are multiple ways to do this, but for my purposes speed is crucial, so keeping the payload small matters just as much.

Since there are 1024 x 768 = 786432 elements (pixels) in each array, and each element is either True or False, it is theoretically possible to pack the array into 98,304 uncompressed bytes or 96 kilobytes.

786432 bits / 8 bits per byte =         98304 bytes
98304 bytes / 1024 bytes per kilobyte = 96    kilobytes

This requires flattening the array:

[ [True, False, True, ..., True]
  [False, True, True, ..., True]
  ...
  [True, True, False, ..., False] ]

# flatten the array

[True, False, True, ..., False]

In principle, the flattened array can then be packed eight elements to a byte: 786,432 bits fit evenly into 98,304 bytes, so each array should be representable as 98,304 eight-bit chars.
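As a quick check of the arithmetic above, here is a minimal sketch that packs a boolean array of the stated size and reports its footprint before and after. The random array is only a stand-in for the real thresholded image, and the names ROWS, COLS, and frame are made up for illustration.

```python
import numpy as np

ROWS, COLS = 768, 1024

# Stand-in for the thresholded image: a random 768x1024 boolean array.
rng = np.random.default_rng(0)
frame = rng.random((ROWS, COLS)) > 0.5

flat = frame.ravel()        # row-major flattening, 786432 elements
packed = np.packbits(flat)  # eight booleans per uint8

print(frame.nbytes)   # 786432 (numpy stores one byte per bool)
print(packed.nbytes)  # 98304
```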

The Question

How can I quickly send 1024-by-768 bool numpy arrays over Ethernet? I'm looking into the bitstring python library, but I'm not sure how to quickly pipe the numpy arrays into a bitstring class.

Additional Information / Questions

To be specific, I'm sending these arrays from a Raspberry Pi 2 to a regular Raspberry Pi.

  1. Are socket and SOCK_STREAM the fastest way to go about this?
  2. Given the RPi's computing power, would it be faster to compress and decompress the arrays? If so, the compression must be lossless.
  3. I've looked into serializing the numpy arrays rather than using bitstring stuff, but the pickled objects are too large to send over SOCK_STREAM. Am I doing something wrong with socket stuff?

My Code / Solution [SOLVED]

Client

import socket
from scipy.misc import imread  # removed in SciPy 1.2; imageio.imread is the suggested replacement
import numpy

IP = '127.0.0.1'
PORT = 7071
ADDRESS = (IP, PORT)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

image = imread('input.png')[:,:,[2]]
image[image < 170] = 0
image[image != 0] = 1
image = numpy.reshape(image, (-1, 1))
image = numpy.packbits(image)
data = image.tobytes()  # raw bytes of the packed array (tostring() is the deprecated spelling)

sock.connect(ADDRESS)
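# Step over the whole buffer (98304 bytes for a 1024x768 image) in 1024-byte chunks.
for i in range(0, len(data), 1024):
    sock.sendall(data[i:i+1024])  # sendall retries until the whole chunk is written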
sock.shutdown(socket.SHUT_WR)
sock.close()

Server

import socket
from scipy.misc import imsave  # removed in SciPy 1.2; imageio.imwrite is the suggested replacement
import numpy

IP = '127.0.0.1'
PORT = 7071
ADDRESS = (IP, PORT)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(ADDRESS)
sock.listen(1)

while True:
    c, addr = sock.accept()
    data = b''  # recv() returns bytes
    package = c.recv(1024)
    while package:  # an empty result means the client has shut down its side
        data += package
        package = c.recv(1024)
    image = numpy.frombuffer(data, dtype=numpy.uint8)
    image = numpy.unpackbits(image)
    image = numpy.reshape(image, (-1, 1024))  # 1024 columns per row, giving 768 rows
    imsave('output.png', image)
    c.close()
sock.close()

As you can see, I ended up sending each array over TCP/SOCK_STREAM as a series of 1024-byte chunks.
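The code above relies on the client shutting down its side of the connection to mark the end of an array. If several arrays had to go over one connection, one common alternative is to prefix each payload with its length. The following is only a sketch of that idea, not part of the original solution: the 4-byte header and the helper names send_frame, recv_exactly, and recv_frame are assumptions made for illustration, and a local socket pair stands in for the two Raspberry Pis.

```python
import socket
import struct
import threading

HEADER = struct.Struct("!I")  # hypothetical framing: 4-byte big-endian payload length

def send_frame(sock, payload):
    """Send one length-prefixed message."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_exactly(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(min(remaining, 4096))
        if not chunk:
            raise ConnectionError("socket closed before the full frame arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_frame(sock):
    """Receive one length-prefixed message."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)

# Demo over a local socket pair; a real client/server would use connect()/accept().
a, b = socket.socketpair()
payload = bytes(98304)  # same size as one packed 1024x768 array
sender = threading.Thread(target=send_frame, args=(a, payload))
sender.start()
received = recv_frame(b)
sender.join()
a.close()
b.close()
print(len(received))  # 98304
```

Because the receiver knows how many bytes to expect, the connection can stay open for the next array instead of being closed after each one.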

user3745189 asked Apr 27 '15 22:04


1 Answer

You can use np.packbits to pack the contents of an np.bool array into an np.uint8 array 1/8th of the size, such that each 'packed' boolean element uses only a single bit. The original array can be recovered using np.unpackbits.

import numpy as np

x = np.array([0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1], dtype=bool)

print(x.itemsize, x.nbytes)
# 1 16

xp = np.packbits(x)
print(xp)
# [ 24 139]

print(xp.itemsize, xp.nbytes)
# 1 2

print(np.unpackbits(xp))
# [0 0 0 1 1 0 0 0 1 0 0 0 1 0 1 1]

The most obvious way to go from here would be to serialize your packed array to a raw string of bytes, pipe it through a UDP socket, then deserialize it and unpack it on the other side. numpy's native serialization (.tostring() and np.fromstring(), spelled .tobytes() and np.frombuffer() in current NumPy releases) will probably be much faster than using pickle or cPickle.
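A minimal sketch of that round trip, leaving the socket out: it uses the newer .tobytes() / np.frombuffer() spellings, a random array as a stand-in for the image, and assumes both ends already agree on the 768 x 1024 shape, since the raw bytes carry no shape information. The size comparison with pickle is only there to illustrate why pickling the unpacked bool array produces a much larger payload; it says nothing about speed.

```python
import pickle
import numpy as np

ROWS, COLS = 768, 1024
rng = np.random.default_rng(1)
frame = rng.random((ROWS, COLS)) > 0.5  # stand-in boolean image

# Sender side: pack to bits, then take the raw buffer.
raw = np.packbits(frame).tobytes()

# Receiver side: the shape is not part of the raw bytes, so it has to be
# known in advance (assumed here to be ROWS x COLS).
bits = np.unpackbits(np.frombuffer(raw, dtype=np.uint8))
restored = bits.reshape(ROWS, COLS).astype(bool)

print(len(raw))                             # 98304
print(np.array_equal(frame, restored))      # True
print(len(pickle.dumps(frame)) > len(raw))  # True: the bool array pickles at one byte per element
```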

If you want to play around with compression, one option would be to use the native zlib module to compress the string of bytes before passing it through the pipe, then decompress it on the other side. Whether you see any benefit from this will depend strongly on how compressible your input arrays are, as well as on the hardware that's doing the compression/decompression.
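For illustration, here is a rough sketch of the zlib step. The test image (a filled rectangle on an empty background) is an assumption chosen because it compresses well; a noisy image could compress far less, so the sizes printed here say little about real data. Compression level 1 is likewise just a guess at a reasonable speed/ratio trade-off for slow hardware.

```python
import zlib
import numpy as np

ROWS, COLS = 768, 1024

# Assumed test image: a filled rectangle on an empty background.
frame = np.zeros((ROWS, COLS), dtype=bool)
frame[200:500, 300:800] = True

raw = np.packbits(frame).tobytes()
compressed = zlib.compress(raw, 1)  # level 1 favours speed over ratio

restored = zlib.decompress(compressed)
print(len(raw), len(compressed))
print(restored == raw)  # True: zlib is lossless
```

Timing zlib.compress and zlib.decompress on the actual Raspberry Pi, with representative images, would be the only reliable way to tell whether the smaller payload outweighs the extra CPU work.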

ali_m answered Oct 29 '22 11:10