
python read() and write() in large blocks / memory management

I'm writing some python code that splices together large files at various points. I've done something similar in C where I allocated a 1MB char array and used that as the read/write buffer. And it was very simple: read 1MB into the char array then write it out.

But with python I'm assuming it is different: each time I call read() with size = 1M, it will allocate a 1M long character string. And hopefully when the buffer goes out of scope it will be freed in the next gc pass.

Would python handle the allocation this way? If so, would the constant allocation/deallocation cycle be computationally expensive?

Can I tell python to use the same block of memory just like in C? Or is the python vm smart enough to do it itself?

I guess what I'm essentially aiming for is kinda like an implementation of dd in python.

rhlee asked Dec 26 '22 20:12


1 Answer

readinto is a low-level feature that reads into a buffer you supply. Search docs.python.org for readinto to find the docs appropriate for the version of Python you're using. They'll look a lot like this:

readinto(b) Read up to len(b) bytes into bytearray b and return the number of bytes read.

Like read(), multiple reads may be issued to the underlying raw stream, unless the latter is interactive.

A BlockingIOError is raised if the underlying raw stream is in non-blocking mode and has no data available at the moment.
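As a rough illustration of how readinto can give the C-style "one buffer, reused" pattern from the question, here is a minimal sketch. The function name copy_range, its parameters, and the CHUNK_SIZE constant are made up for this example; only readinto(), seek(), write(), bytearray, and memoryview are real Python features. It assumes both files are opened in binary mode and are ordinary blocking streams.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1 MB, the buffer size mentioned in the question


def copy_range(src, dst, offset, length, buf=None):
    """Copy up to `length` bytes from `src`, starting at `offset`, into `dst`.

    Hypothetical helper: one bytearray is allocated (or passed in) and
    reused for every read, instead of getting a new bytes object per chunk.
    Returns the number of bytes actually copied.
    """
    if buf is None:
        buf = bytearray(CHUNK_SIZE)
    view = memoryview(buf)  # slicing a memoryview does not copy the data
    src.seek(offset)
    remaining = length
    copied = 0
    while remaining > 0:
        want = min(len(buf), remaining)
        n = src.readinto(view[:want])  # fills the existing buffer in place
        if not n:  # 0 means end of file
            break
        dst.write(view[:n])  # write only the bytes that were just read
        remaining -= n
        copied += n
    return copied


if __name__ == "__main__":
    # In-memory demo; real code would use open(path, "rb") / open(path, "wb").
    source = io.BytesIO(b"0123456789" * 5)
    target = io.BytesIO()
    copy_range(source, target, offset=10, length=20, buf=bytearray(8))
    print(target.getvalue())
```

The same buf can be passed to every call when splicing several files, so the whole job runs on a single allocation.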

But don't worry about it prematurely. Python allocates and deallocates dynamic memory at a ferocious rate, and it's likely that the cost of repeatedly getting and freeing a measly megabyte will be lost in the noise. And note that CPython is primarily reference-counted, so your buffer will get reclaimed "immediately" when the last reference to it goes away (for example, when the name is rebound or goes out of scope), without waiting for a gc pass. As to whether Python will reuse the same memory space each time, the odds are decent but it's not assured. Python does nothing to try to force that, but depending on the entire allocation/deallocation pattern and the details of the system C library's malloc()/free() implementation, it's not impossible it will get reused ;-)
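For comparison, the plain read()-based loop that this paragraph suggests trying first might look like the sketch below. The name copy_simple and its chunk_size parameter are invented for illustration. Each iteration gets a fresh bytes object from read(), and in CPython the previous chunk is released as soon as the name chunk is rebound.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1 MB per read, as in the question


def copy_simple(src, dst, chunk_size=CHUNK_SIZE):
    """Copy everything from `src` to `dst` in chunks; return bytes copied.

    Hypothetical helper: no buffer reuse, just read() and write().
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)  # new bytes object on each call
        if not chunk:  # empty bytes means end of file
            break
        dst.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"abc" * 1000)
    target = io.BytesIO()
    print(copy_simple(source, target, chunk_size=256))
```

The standard library's shutil.copyfileobj(fsrc, fdst, length) performs essentially this kind of chunked copy, so it is worth measuring the simple version before reaching for anything lower-level.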

Tim Peters answered Feb 11 '23 13:02