
numpy boolean array with 1 bit entries

Is there a way in numpy to create an array of booleans that uses just 1 bit for each entry?

The standard np.bool type is 1 byte, but this way I use 8 times the required memory.

On Google I found that C++ has std::vector<bool>.

asked Apr 09 '11 00:04 by Andrea Zonca



2 Answers

To do this you can use numpy's packbits and unpackbits:

    import numpy as np

    # original boolean array
    A1 = np.array([
        [0, 1, 1, 0, 1],
        [0, 0, 1, 1, 1],
        [1, 1, 1, 1, 1],
    ], dtype=bool)

    # packed data: axis=None flattens the array before packing
    A2 = np.packbits(A1, axis=None)

    # checking the size (tobytes replaces the deprecated tostring)
    print(len(A1.tobytes()))  # 15 bytes
    print(len(A2.tobytes()))  #  2 bytes (ceil(15/8))

    # reconstructing from packed data: count drops the padding bits
    # (the count argument requires NumPy >= 1.17), then reshape
    A3 = np.unpackbits(A2, count=A1.size).reshape(A1.shape).view(bool)

    # and the arrays are equal
    print(np.array_equal(A1, A3))  # True

Prior to NumPy 1.17.0, packbits was just as straightforward to use, but unpackbits had no count argument, so reconstruction required slicing off the padding bits by hand. Here is an example:

    import numpy as np

    # original boolean array (plain bool works as the dtype on every NumPy version)
    A1 = np.array([
        [0, 1, 1, 0, 1],
        [0, 0, 1, 1, 1],
        [1, 1, 1, 1, 1],
    ], dtype=bool)

    # packed data: axis=None flattens the array before packing
    A2 = np.packbits(A1, axis=None)

    # checking the size (tobytes replaces the deprecated tostring)
    print(len(A1.tobytes()))  # 15 bytes
    print(len(A2.tobytes()))  #  2 bytes (ceil(15/8))

    # reconstructing from packed data: unpack, slice off the padding bits, reshape
    A3 = np.unpackbits(A2, axis=None)[:A1.size].reshape(A1.shape).astype(bool)

    # and the arrays are equal
    print(np.array_equal(A1, A3))  # True
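
If you need to read or change single entries without unpacking the whole array, you can index the packed bytes directly. The following is only a rough sketch: get_bit and set_bit are hypothetical helper names, not NumPy functions, and they assume a flat array packed with packbits' default bit order (the first entry goes into the most significant bit of the first byte).

```python
import numpy as np

def get_bit(packed, i):
    """Return entry i of a flat uint8 array produced by np.packbits (default bit order)."""
    return bool((packed[i >> 3] >> (7 - (i & 7))) & 1)

def set_bit(packed, i, value):
    """Set entry i of the packed array in place."""
    mask = np.uint8(1 << (7 - (i & 7)))
    if value:
        packed[i >> 3] |= mask
    else:
        packed[i >> 3] &= ~mask

# 20 boolean flags fit in ceil(20/8) = 3 bytes once packed
flags = np.zeros(20, dtype=bool)
flags[[3, 8, 19]] = True
packed = np.packbits(flags)

was_set = get_bit(packed, 8)   # read one entry without unpacking
set_bit(packed, 8, False)      # clear entry 8
set_bit(packed, 5, True)       # set entry 5

# unpack once at the end, dropping the 4 padding bits by slicing
restored = np.unpackbits(packed)[:flags.size].astype(bool)
print(np.flatnonzero(restored))
```

This works one entry at a time, so it suits occasional lookups; for bulk operations it is usually simpler to unpack, work on the boolean array, and pack again.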
answered Sep 28 '22 04:09 by Salvador Dali


You want a bitarray:

efficient arrays of booleans -- C extension

This module provides an object type which efficiently represents an array of booleans. Bitarrays are sequence types and behave very much like usual lists. Eight bits are represented by one byte in a contiguous block of memory. The user can select between two representations; little-endian and big-endian. All of the functionality is implemented in C. Methods for accessing the machine representation are provided. This can be useful when bit level access to binary files is required, such as portable bitmap image files (.pbm). Also, when dealing with compressed data which uses variable bit length encoding, you may find this module useful...
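
To make the "eight bits per byte" layout and the two representations concrete, here is a small pure-Python sketch. It is not bitarray's code or API: the PackedBits class and its attributes are made up for illustration, and it assumes that "little-endian" and "big-endian" refer to the order of bits inside each byte.

```python
class PackedBits:
    """Illustrative only: a fixed-length boolean sequence stored 8 entries per byte."""

    def __init__(self, length, endian="big"):
        if endian not in ("big", "little"):
            raise ValueError("endian must be 'big' or 'little'")
        self.length = length
        self.endian = endian
        # one contiguous block of ceil(length / 8) bytes, all bits cleared
        self.buffer = bytearray((length + 7) // 8)

    def _locate(self, i):
        """Return (byte index, bit mask) for entry i."""
        if not 0 <= i < self.length:
            raise IndexError(i)
        offset = i % 8
        # big: entry 0 is the most significant bit; little: the least significant
        shift = 7 - offset if self.endian == "big" else offset
        return i // 8, 1 << shift

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        byte, mask = self._locate(i)
        return bool(self.buffer[byte] & mask)

    def __setitem__(self, i, value):
        byte, mask = self._locate(i)
        if value:
            self.buffer[byte] |= mask
        else:
            self.buffer[byte] &= ~mask & 0xFF


# the same two entries set under each representation
big = PackedBits(10, endian="big")
little = PackedBits(10, endian="little")
for bits in (big, little):
    bits[0] = True
    bits[9] = True

print(bytes(big.buffer).hex())     # 8040
print(bytes(little.buffer).hex())  # 0102
```

The logical contents are identical in both cases; only the machine representation differs, which matters when the bytes are written to or read from a binary file.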

answered Sep 28 '22 05:09 by Chris Eberle