The function signature for pandas.read_csv gives, among others, the following options:
read_csv(filepath_or_buffer, low_memory=True, memory_map=False, iterator=False, chunksize=None, ...)
I couldn't find any documentation for either the low_memory or the memory_map flag. I am confused about whether these features are implemented yet and, if so, how they work.
Specifically:
memory_map: If implemented, does it use np.memmap, and if so, does it store the individual columns as memmaps or the rows?
low_memory: Does it specify something like a cache to store in memory?
Also, is there a way to convert an existing DataFrame to a memmapped DataFrame?
P.S.: versions of relevant modules:
pandas==0.14.0
scipy==0.14.0
numpy==1.8.1
I will attempt to sum up the comments to this question and also add my own research into one comprehensive answer.
The low_memory option is kind of deprecated, in that it does not actually do anything anymore (source).
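A quick way to see what the flag changes from the caller's side is to parse the same data with both settings and compare the results. The snippet below is only a minimal sketch: the CSV text and variable names are made up for illustration, and it checks only the resulting DataFrame on a tiny input, so it says nothing about memory use during parsing or about behaviour on large files.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV, used only for illustration.
csv_text = "a,b\n1,x\n2,y\n3,z\n"

# Parse the same text once with each setting of low_memory.
default_df = pd.read_csv(io.StringIO(csv_text), low_memory=True)
explicit_df = pd.read_csv(io.StringIO(csv_text), low_memory=False)

# Compare the two results value by value, including dtypes.
frames_match = default_df.equals(explicit_df)
print(frames_match)
```

On a small, cleanly typed file like this one I would expect the two frames to be identical; that is consistent with the flag making no visible difference here, though it does not prove the flag is a no-op internally.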
memory_map does not seem to use the numpy memory map, as far as I can tell from the source code. It seems to be an option for how to parse the incoming stream of data, not something that affects how the DataFrame you receive works.
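To check this from the outside, one can read a file with memory_map=True and inspect what comes back. This is a rough sketch, not a definitive test: the file name example.csv and the column names are hypothetical, and it assumes a reasonably recent pandas, since Series.to_numpy is not available in 0.14. For what it is worth, the current pandas documentation describes memory_map as mapping the file object onto memory while reading in order to avoid I/O overhead, which matches the idea that it only affects parsing.

```python
import os
import tempfile

import numpy as np
import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file created just for this check.
    csv_path = os.path.join(tmp_dir, "example.csv")
    with open(csv_path, "w") as handle:
        handle.write("a,b\n1,2.5\n3,4.5\n")

    # Same file, read with and without memory mapping.
    mapped = pd.read_csv(csv_path, memory_map=True)
    plain = pd.read_csv(csv_path)

# Check whether any column of the result is backed by np.memmap.
is_memmap_backed = any(
    isinstance(mapped[col].to_numpy(), np.memmap) for col in mapped.columns
)

# Check that both reads produced the same data.
same_contents = mapped.equals(plain)

print(type(mapped).__name__, is_memmap_backed, same_contents)
```

If the columns were stored as memmaps, the isinstance check would pick that up; the temporary file is also deleted before the frames are inspected, which would not be safe if the data still lived in a mapping of that file.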