
TypeError("cannot pickle '_io.BufferedReader' object")

I'm new to multiprocessing and I'm trying to write a program that gets the top 10 results for a search query on Google. In this example, I just want to run two search queries simultaneously. Here is what I have:

from multiprocessing.pool import Pool
from googlesearch import search

def getGoogleResults(query):
    links = []
    # from geeks4geeks
    print("Getting google results...")
    for j in search(query, tld="co.in", num=10, stop=10, pause=2):
        links.append(j)
    print("Got google results!")
    return links

queries = ["stackoverflow", "github"]

if __name__ == "__main__":
    with Pool(2) as p:
        # map() returns one list of links per query, in the same order as `queries`
        queryResults = dict(zip(queries, p.map(getGoogleResults, queries)))

However, when I run it, I get the following error:

File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/pool.py", line 771, in get
    raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '<multiprocessing.pool.ExceptionWithTraceback object at 0x101b23820>'. Reason: 'TypeError("cannot pickle '_io.BufferedReader' object")'

I haven't been able to find any place where this issue is addressed. Any help is greatly appreciated!

I've narrowed it down to the loop containing links.append(j), but I'm not sure how to fix it. I found many posts about this error, but none with an answer.

NYT asked Jun 25 '26 19:06


1 Answer

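I hope this is not too late. I got the same error when mapping with a multiprocessing pool. What fixed it for me was switching to ThreadPoolExecutor, whose map() is used the same way as Pool.map(). Threads share memory with the main program, so results and exceptions are passed back without being pickled.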

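from concurrent import futures

# convert_pdf_to_hocr and image_pdf_pages are placeholders from my own use case;
# substitute your own function and list of inputs.
with futures.ThreadPoolExecutor(10) as executor:
    # executor.map() returns a lazy iterator; list() materializes the results
    hocr_data = list(executor.map(convert_pdf_to_hocr, image_pdf_pages))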
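
Applied to the code in the question, a minimal sketch could look like this. Note that fetch_links and run_queries are hypothetical names: fetch_links is only a stand-in for getGoogleResults so the example runs without network access, and you would swap the real function back in.

```python
from concurrent import futures


def fetch_links(query):
    # Hypothetical stand-in for getGoogleResults(); replace with the real function.
    return [f"https://example.com/{query}/{i}" for i in range(3)]


def run_queries(queries, max_workers=2):
    # Threads share memory, so return values and exceptions are never pickled.
    with futures.ThreadPoolExecutor(max_workers) as executor:
        # executor.map() yields results in the same order as `queries`.
        return dict(zip(queries, executor.map(fetch_links, queries)))


if __name__ == "__main__":
    results = run_queries(["stackoverflow", "github"])
    for query, links in results.items():
        print(query, links)
```

If the search itself fails, the original exception is re-raised unchanged while the results of executor.map() are consumed, so you get to see the real cause instead of a pickling error.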

Give it a try.
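
If you would rather stay with multiprocessing: the MaybeEncodingError in the question says that the worker raised an exception and the pool could not pickle it to send it back to the parent, apparently because the exception holds an open stream. One option is to catch the exception inside the worker and return only plain values. This is a rough sketch under assumptions, not a confirmed fix for googlesearch: FetchError, fetch_links and safe_fetch_links are hypothetical names, and FetchError merely imitates an exception that carries an open file-like object.

```python
import io
from multiprocessing.pool import Pool


class FetchError(Exception):
    # Hypothetical exception that carries an open stream, which makes it unpicklable.
    def __init__(self, message):
        super().__init__(message)
        self.fp = io.BufferedReader(io.BytesIO(b"response body"))


def fetch_links(query):
    # Hypothetical stand-in for getGoogleResults(); fails for one query on purpose.
    if query == "blocked":
        raise FetchError("request rejected")
    return [f"https://example.com/{query}"]


def safe_fetch_links(query):
    # Return (query, links, error) so only plain, picklable values leave the worker.
    try:
        return query, fetch_links(query), None
    except Exception as exc:
        return query, [], f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    with Pool(2) as p:
        for query, links, error in p.map(safe_fetch_links, ["github", "blocked"]):
            print(query, links, error)
```

The error string then tells you what actually went wrong inside the worker, which the pickling failure was hiding.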

HoangNgx answered Jun 28 '26 09:06