To download the question-and-answer data, I am following the script in facebook/ELI5.
It says to run the command: python download_reddit_qalist.py -Q. Running this command raises an error at line 70 of download_reddit_qalist.py, where the ZstdDecompressor stream is enumerated. The error log says:
zstd.ZstdError: Zstd decompress error: Frame requires too much memory for decoding
Thinking it was a memory issue, I allocated 32 GB of memory and 8 CPUs to the container, but the error persists.
When I replaced the enumerate() call with ElementTree.iterparse(), the same error appeared, with this additional context in the traceback:
for i, l in ET.iterparse(f):
File "/anaconda3/lib/python3.8/xml/etree/ElementTree.py", line 1229, in iterator
data = source.read(100 * 2048)
zstd.ZstdError: zstd decompress error: Frame requires too much memory for decoding
Has anyone faced a similar error? I have the Docker container running on a Slurm cluster. If you need more information, let me know.
In case anyone faces this error in the future, the fix is to raise the decompressor's window limit. In the file download_reddit_qalist.py, on line 66, change the decompressor construction to:

zstd.ZstdDecompressor(max_window_size=2147483648)