In numpy, I have an array of bools. The array was retrieved from an image; it is two-dimensional and contains 1024 columns and 768 rows. I want to push this data over an Ethernet cable. There are multiple ways to do this, but for my purposes speed is extremely crucial, and therefore memory is also very crucial.
Since there are 1024 x 768 = 786432 elements (pixels) in each array, and each element is either True or False, it is theoretically possible to pack the array into 98,304 uncompressed bytes or 96 kilobytes.
786432 bits / 8 bits per byte = 98304 bytes
98304 bytes / 1024 bytes per kilobyte = 96 kilobytes
This requires flattening the array:
[ [True, False, True, ..., True]
[False, True, True, ..., True]
...
[True, True, False, ..., False] ]
# flatten the array
[True, False, True, ..., False]
The flattened array can then be represented as the bits of bytes: since 786,432 bits fit evenly into 98,304 bytes, each array should be representable by 98,304 eight-bit chars.
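The size arithmetic above can be checked directly in numpy. This is only a minimal sketch: the array `frame` is a hypothetical, randomly generated stand-in for the thresholded image, not data from the question.

```python
import numpy as np

# Hypothetical stand-in for the image: 768 rows x 1024 columns of bools.
rng = np.random.default_rng(0)
frame = rng.random((768, 1024)) > 0.5

flat = frame.ravel()          # 1-D view with 786432 elements
packed = np.packbits(flat)    # 8 bools stored per uint8 byte

print(frame.nbytes)           # 786432: one byte per bool before packing
print(packed.nbytes)          # 98304: one bit per bool after packing
```

Because 786,432 is a multiple of 8, no padding bits are added to the last byte.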
How can I quickly send 1024-by-768 bool numpy arrays over Ethernet? I'm looking into the bitstring Python library, but I'm not sure how to quickly pipe the numpy arrays into a bitstring class.
To be specific, I'm sending these arrays from a Raspberry Pi 2 to a regular Raspberry Pi.
Is using socket and SOCK_STREAM the fastest way to go about this?
I tried the bitstring stuff, but the pickled objects are too large to send over SOCK_STREAM. Am I doing something wrong with the socket stuff?
# Sender: threshold the image, pack it to bits, and transmit it.
import socket
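from scipy.misc import imread  # removed in SciPy 1.2+; imageio's imread is the usual substitute
import numpy

IP = '127.0.0.1'
PORT = 7071
ADDRESS = (IP, PORT)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Keep only channel index 2, then threshold to 0/1.
image = imread('input.png')[:,:,[2]]
image[image < 170] = 0
image[image != 0] = 1

# Flatten and pack 8 pixels per byte.
image = numpy.reshape(image, (-1, 1))
image = numpy.packbits(image)
data = image.tobytes()

sock.connect(ADDRESS)
# Send in 1024-byte chunks; sendall() retries until each chunk is fully written.
for i in range(0, len(data), 1024):
    sock.sendall(data[i:i+1024])
sock.shutdown(socket.SHUT_WR)
sock.close()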
# Receiver: accept a connection, read until the sender closes, unpack, and save.
import socket
from scipy.misc import imsave  # removed in SciPy 1.2+; imageio's imwrite is the usual substitute
import numpy

IP = '127.0.0.1'
PORT = 7071
ADDRESS = (IP, PORT)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(ADDRESS)
sock.listen(1)

while True:
    c, addr = sock.accept()
    data = b''
    package = c.recv(1024)
    while package:          # recv() returns b'' once the sender shuts down
        data += package
        package = c.recv(1024)
    image = numpy.frombuffer(data, dtype=numpy.uint8)
    image = numpy.unpackbits(image)
    image = numpy.reshape(image, (-1, 1024))  # 768 rows x 1024 columns
    imsave('output.png', image)
    c.close()
sock.close()
As you can see, I ended up sending each array over TCP/SOCK_STREAM as a series of 1024-byte chunks.
You can use np.packbits to pack the contents of an np.bool array into an np.uint8 array 1/8th of the size, such that each 'packed' boolean element uses only a single bit. The original array can be recovered using np.unpackbits.
import numpy as np
x = np.array([0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1], dtype=bool)
print(x.itemsize, x.nbytes)
# 1 16
xp = np.packbits(x)
print(xp)
# [ 24 139]
print(xp.itemsize, xp.nbytes)
# 1 2
print(np.unpackbits(xp))
# [0 0 0 1 1 0 0 0 1 0 0 0 1 0 1 1]
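One detail worth spelling out: np.unpackbits returns a flat uint8 array of 0s and 1s, so recovering a 2-D boolean array also needs a reshape and a dtype conversion, and any padding bits have to be trimmed when the element count is not a multiple of 8. A small sketch, using a hypothetical 3 x 5 array named `mask`:

```python
import numpy as np

# Hypothetical 3 x 5 boolean array; 15 elements is not a multiple of 8.
mask = np.array([[1, 0, 1, 1, 0],
                 [0, 0, 1, 0, 1],
                 [1, 1, 0, 0, 0]], dtype=bool)

packed = np.packbits(mask)     # flattens, then zero-pads the last byte
bits = np.unpackbits(packed)   # 16 values: 15 real bits plus 1 padding bit

# Drop the padding, then restore the original shape and boolean dtype.
restored = bits[:mask.size].reshape(mask.shape).astype(bool)

print(packed.nbytes, bits.size)        # 2 16
print(np.array_equal(mask, restored))  # True
```

For the 1024-by-768 case no trimming is needed, since 786,432 is divisible by 8, but the reshape is still required.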
The most obvious way to go from here would be to serialize your packed array to a raw string of bytes, pipe it through a UDP socket, then deserialize it and unpack it on the other side. numpy's native serialization (.tostring() and np.fromstring(), spelled .tobytes() and np.frombuffer() in current NumPy) will probably be much faster than using pickle or cPickle.
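The serialize, send, deserialize round trip might look roughly like the sketch below. Note that it swaps in a local stream socket pair (socket.socketpair()) instead of UDP: a packed frame is 98,304 bytes, which is more than a single IPv4 UDP datagram can carry (65,507 bytes of payload), so UDP would need extra chunking logic. The helper names `send_frame` and `recv_frame` and the random test frame are hypothetical.

```python
import socket
import threading
import numpy as np

SHAPE = (768, 1024)                    # rows, columns
NBYTES = SHAPE[0] * SHAPE[1] // 8      # 98304 packed bytes per frame

def send_frame(sock, frame):
    """Pack a boolean frame to bits and write all of it to the socket."""
    sock.sendall(np.packbits(frame).tobytes())

def recv_frame(sock):
    """Read exactly NBYTES from the socket and rebuild the boolean frame."""
    buf = bytearray()
    while len(buf) < NBYTES:
        chunk = sock.recv(NBYTES - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before a full frame arrived")
        buf.extend(chunk)
    bits = np.unpackbits(np.frombuffer(bytes(buf), dtype=np.uint8))
    return bits.reshape(SHAPE).astype(bool)

# Demo on a local socket pair; a real setup would use connect()/accept().
frame = np.random.default_rng(1).random(SHAPE) > 0.5
a, b = socket.socketpair()
sender = threading.Thread(target=send_frame, args=(a, frame))
sender.start()
received = recv_frame(b)
sender.join()
a.close()
b.close()

print(np.array_equal(frame, received))
```

Because every frame has the same known size, the receiver can count bytes rather than wait for the connection to close, which lets one connection carry many frames in a row.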
If you want to play around with compression, one option would be to use the native zlib module to compress the string of bytes before passing it through the pipe, then decompress it on the other side. Whether you see any benefit from this will depend strongly on how compressible your input arrays are, as well as on the hardware that's doing the compression/decompression.
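As a rough illustration of the zlib option, the sketch below compresses a packed frame and restores it. The frame here is a hypothetical, deliberately compressible example (mostly False with one filled rectangle); a noisy image would compress far less, and compression level 1 is an assumed choice that trades ratio for speed.

```python
import zlib
import numpy as np

# Hypothetical, highly compressible frame: mostly False with one filled rectangle.
frame = np.zeros((768, 1024), dtype=bool)
frame[200:400, 300:700] = True

raw = np.packbits(frame).tobytes()
compressed = zlib.compress(raw, 1)     # level 1 favours speed over ratio

restored = np.unpackbits(
    np.frombuffer(zlib.decompress(compressed), dtype=np.uint8)
).reshape(frame.shape).astype(bool)

print(len(raw), len(compressed))       # packed size vs. compressed size
print(np.array_equal(frame, restored))
```

Timing both the compression step and the transfer on the actual hardware is the only reliable way to tell whether the smaller payload pays for the extra CPU work.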