To download the question-and-answer data, I am following the script in facebook/ELI5.
It says to run the command: python download_reddit_qalist.py -Q. Running this command raises an error at line 70 of download_reddit_qalist.py, where the ZstdDecompressor stream is enumerated. The error log says:
zstd.ZstdError: Zstd decompress error: Frame requires too much memory for decoding
Thinking it was a memory issue, I allocated 32 GB of memory and 8 CPUs to the container, but the error persists.
When I replaced the enumerate() call with ElementTree.iterparse(), the same error appeared, with this additional context in the traceback:
for i, l in ET.iterparse(f):
File "/anaconda3/lib/python3.8/xml/etree/ElementTree.py", line 1229, in iterator
data = source.read(100 * 2048)
zstd.ZstdError: zstd decompress error: Frame requires too much memory for decoding
Has anyone faced a similar error? I have the Docker container running on a Slurm cluster. If you need more information, let me know.
In case anyone faces this error in the future, the fix is to raise the decompressor's window limit. In the file download_reddit_qalist.py, on line 66, change the decompressor construction to:

zstd.ZstdDecompressor(max_window_size=2147483648)