Is there any guideline on selecting chunk size?
I tried different chunk sizes, but none of them gave a download speed comparable to that of a browser or wget.
Here is a snapshot of my code:
r = requests.get(url, headers=headers, stream=True)
total_length = r.headers.get('content-length')  # None if the header is missing
if total_length is not None:
    total_length = int(total_length)
for chunk in r.iter_content(1024):
    f.write(chunk)
Any help would be appreciated.
Edit: I tried networks with different speeds, and I was able to achieve a higher speed than on my home network. But when I tested with wget and a browser, the speed was still not comparable.
Thanks
You will lose time switching between reads and writes, and AFAIK the only limit on the chunk size is what you can hold in memory. So as long as you aren't very concerned about keeping memory usage down, go ahead and specify a large chunk size, such as 1 MB (i.e. 1024 * 1024 bytes) or even 10 MB. Chunk sizes in the 1024-byte range (or even smaller, as it sounds like you've tested) will slow the process down substantially.

For a very heavy-duty situation where you want as much performance as possible out of your code, you could look at the io module for buffering. But I think increasing the chunk size by a factor of 1000 or 10,000 or so will probably get you most of the way there.
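To make that concrete, here is a minimal sketch of the download loop with a 1 MB chunk size. The URL, headers, and output filename are hypothetical placeholders, and the buffering argument to open() is just one optional way to use the io module's buffering mentioned above:

import requests

# Hypothetical placeholders -- substitute your actual URL and headers.
url = 'https://example.com/bigfile.bin'
headers = {}

CHUNK_SIZE = 1024 * 1024  # 1 MB per read, vs. 1024 bytes in the question

with requests.get(url, headers=headers, stream=True) as r:
    r.raise_for_status()
    total_length = r.headers.get('content-length')  # None if the header is missing
    # A large write buffer (backed by the io module) reduces the number
    # of actual disk writes the loop performs.
    with open('bigfile.bin', 'wb', buffering=8 * 1024 * 1024) as f:
        for chunk in r.iter_content(chunk_size=CHUNK_SIZE):
            f.write(chunk)

With 1 MB chunks the loop makes roughly a thousandth as many read/write calls as with 1024-byte chunks, which is where most of the speedup comes from.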