I'm working with a large matrix (250x250x30 = 1,875,000 cells), and I'd like a way to set an arbitrary number of flags for each cell in this matrix, in some manner that's easy to use and reasonably space efficient.
My original plan was a 250x250x30 list array, where each element was something like: ["FLAG1","FLAG8","FLAG12"]
. I then changed it to storing just integers instead: [1,8,12]
. These integers are mapped internally by getter/setter functions to the original flag strings. This only uses 250mb with 8 flags per point, which is fine in terms of memory.
My question is: am I missing another obvious way to structure this sort of data?
Thanks all for your suggestions. I ended up rolling a few suggestions into one, sadly I can only pick one answer and have to live with upvoting the others:
EDIT: erm the initial code I had here (using sets as the base element of a 3d numpy array) used A LOT of memory. This new version uses around 500mb when filled with randint(0,2**1000)
.
import numpy
FLAG1=2**0
FLAG2=2**1
FLAG3=2**2
FLAG4=2**3
(x,y,z) = (250,250,30)
array = numpy.zeros((x,y,z), dtype=object)
def setFlag(location,flag):
array[location] |= flag
def unsetFlag(location,flag):
array[location] &= ~flag
Your solution is fine if every single cell is going to have a flag. However if you are working with a sparse dataset where only a small subsection of your cells will have flags what you really want is a dictionary. You would want to set up the dictonary so the key is a tuple for the location of the cell and the value is a list of flags like you have in your solution.
allFlags = {(1,1,1):[1,2,3], (250,250,30):[4,5,6]}
Here we have the 1,1,1 cell have the flags 1,2, and 3 and the cell 250,250,30 have the flags 4,5, and 6
edit- fixed key tuples, thanks Andre, and dictionary syntax.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With