
How to resolve the zstandard error "Frame requires too much memory for decoding"

To download the question-and-answer data, I am following the scripts in the facebook/ELI5 repository.

The instructions there say to run the command: python download_reddit_qalist.py -Q. When I run it, I get an error at line 70 of download_reddit_qalist.py, where the zstandard decompressor object is enumerated. The error log says:

zstd.ZstdError: Zstd decompress error: Frame requires too much memory for decoding

Suspecting a memory shortage, I allocated 32 GB of memory and 8 CPUs to the container, but the error persists.

When I replaced the enumerate call with ElementTree.iterparse(), the same error appeared, this time with an additional traceback:

for i, l in ET.iterparse(f):

File "/anaconda3/lib/python3.8/xml/etree/ElementTree.py", line 1229, in iterator

data = source.read(100 * 2048)

zstd.ZstdError: zstd decompress error: Frame requires too much memory for decoding

Has anyone faced a similar error? I am running the Docker container on a Slurm cluster. If you need more information, let me know.

akshit bhatia asked Oct 27 '25 03:10



1 Answer

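For anyone who hits this error in the future, the fix is to raise the decompressor's window limit. In download_reddit_qalist.py, on line 66, construct the decompressor like this:

zstd.ZstdDecompressor(max_window_size=2147483648)

The value 2147483648 is 2**31 bytes (2 GiB). The error refers to the window size declared in the compressed frame exceeding the limit the decompressor accepts, not to the RAM available on the machine, which is why giving the container more memory did not help.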
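As a rough illustration of the change described above, here is a minimal standalone sketch that reads a .zst file line by line with a raised window limit. It assumes the third-party zstandard package (pip install zstandard) and UTF-8 text with one record per line. The helper name iter_zst_lines and the constant MAX_WINDOW_SIZE are made up for this example and do not come from the ELI5 script.

```python
import io

# 2**31 bytes (2 GiB), the same value as in the answer above.
MAX_WINDOW_SIZE = 2 ** 31


def iter_zst_lines(path, max_window_size=MAX_WINDOW_SIZE):
    """Yield text lines from a zstd-compressed file (hypothetical helper)."""
    # Imported here so the third-party dependency is only needed when the
    # function is actually called: pip install zstandard
    import zstandard

    # Raising max_window_size lets the decompressor accept frames that
    # declare a large window, which the default limit would reject.
    dctx = zstandard.ZstdDecompressor(max_window_size=max_window_size)
    with open(path, "rb") as raw, dctx.stream_reader(raw) as reader:
        # Wrap the binary stream so it can be iterated as decoded text.
        text = io.TextIOWrapper(reader, encoding="utf-8")
        for line in text:
            yield line.rstrip("\n")
```

A loop such as for i, line in enumerate(iter_zst_lines(path)) then streams the file without loading it all at once; path stands for whichever dump file the script downloaded.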

akshit bhatia answered Oct 29 '25 22:10
