I have a fragmented structure in memory and I'd like to access it as a contiguous-looking memoryview. Is there an easy way to do this or should I implement my own solution?
For example, consider a file format that consists of records. Each record has a fixed length header, that specifies the length of the content of the record. A higher level logical structure may spread over several records. It would make implementing the higher level structure easier if it could see it's own fragmented memory location as a simple contiguous array of bytes.
Update:
It seems that python supports this 'segmented' buffer type internally, at least based on this part of the documentation. But this is only the C API.
Update2:
As far as I see, the referenced C API - called old-style buffers - does what I need, but it's now deprecated and unavailable in newer version of Python (3.X). The new buffer protocol - specified in PEP 3118 - offers a new way to represent buffers. This API is more usable in most of the use cases (among them, use cases where the represented buffer is not contiguous in memory), but does not support this specific one, where a one dimensional array may be laid out completely freely (multiple differently sized chunks) in memory.
It allows you to share memory between data-structures (things like PIL images, SQLlite data-bases, NumPy arrays, etc.) without first copying. This is very important for large data sets.
A memoryview object exposes the C level buffer interface as a Python object which can then be passed around like any other object.
¶ The Python buffer protocol, also known in the community as PEP 3118, is a framework in which Python objects can expose raw byte arrays to other Python objects. This can be extremely useful for scientific computing, where we often use packages such as NumPy to efficiently store and manipulate large arrays of data.
First - I am assuming you are just trying to do this in pure python rather than in a c extension. So I am assuming you have loaded in the different records you are interested in into a set of python objects and your problem is that you want to see the higher level structure that is spread across these objects with bits here and there throughout the objects.
So can you not simply load each of the records into a byte arrays type? You can then use python slicing of arrays to create a new array that has just the data for the high level structure you are interested in. You will then have a single byte array with just the data you are interested in and can print it out or manipulate it in any way that you want to.
So something like:
a = bytearray(b"Hello World") # put your records into byte arrays like this
b = bytearray(b"Stack Overflow")
complexStructure = bytearray(a[0:6]+b[0:]) # Slice and join arrays to form
# new array with just data from your
# high level entity
print complexStructure
Of course you will still ned to know where within the records your high level structure is to slice the arrays correctly but you would need to know this anyway.
EDIT:
Note taking a slice of a list does not copy the data in the list it just creates a new set of references to the data so:
>>> a = [1,2,3]
>>> b = a[1:3]
>>> id(a[1])
140268972083088
>>> id(b[0])
140268972083088
However changes to the list b will not change a as b is a new list. To have the changes automatically change in the original list you would need to make a more complicated object that contained the lists to the original records and hid them in such a way as to be able to decide which list and which element of a list to change or view when a user look to modify/view the complex structure. So something like:
class ComplexStructure():
def add_records(self,record):
self.listofrecords.append(record)
def get_value(self,position):
listnum,posinlist = ... # formula to figure out which list and where in
# list element of complex structure is
return self.listofrecords[listnum][record]
def set_value(self,position,value):
listnum,posinlist = ... # formula to figure out which list and where in
# list element of complex structure is
self.listofrecords[listnum][record] = value
Granted this is not the simple way of doing things you were hoping for but it should do what you need.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With