How do I fix a ValueError: read of closed file exception?

Question

This simple Python 3 script:

import urllib.request

host = "scholar.google.com"
link = "/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0"
url = "http://" + host + link
filename = "cite0.bib"
print(url)
urllib.request.urlretrieve(url, filename)

raises this exception:

Traceback (most recent call last):
  File "C:\Users\ricardo\Desktop\Google-Scholar\BibTex\test2.py", line 8, in <module>
    urllib.request.urlretrieve(url, filename)
  File "C:\Python32\lib\urllib\request.py", line 150, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "C:\Python32\lib\urllib\request.py", line 1597, in retrieve
    block = fp.read(bs)
ValueError: read of closed file

I thought this might be a temporary problem, so I added some simple exception handling like so:

import random
import time
import urllib.request

host = "scholar.google.com"
link = "/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0"
url = "http://" + host + link
filename = "cite0.bib"
print(url)
while True:
    try:
        print("Downloading...")
        time.sleep(random.randint(0, 5))
        urllib.request.urlretrieve(url, filename)
        break
    except ValueError:
        pass

but this just prints Downloading... ad infinitum.

mouad · Accepted Answer

Your URL returns a 403 error, and apparently urllib.request.urlretrieve is not good at detecting all HTTP errors: it uses urllib.request.FancyURLopener, which tries to swallow the error by returning a urlinfo object instead of raising an exception. The later read on that already-closed response is what produces the ValueError, so retrying on ValueError can never succeed.
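
One way to confirm the status code that urlretrieve hides is to request the URL with urlopen, which raises urllib.error.HTTPError for 4xx and 5xx responses. This is only a minimal sketch; the helper name probe_status and the 10-second timeout are assumptions for illustration, not part of the original script:

```python
import urllib.error
import urllib.request


def probe_status(url, timeout=10):
    """Return the HTTP status code for url, including error codes such as 403.

    probe_status is a hypothetical helper name used only for this sketch.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses instead of hiding them.
        return err.code
```

Calling probe_status(url) with the URL from the question should return 403 if the diagnosis above holds.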

As for the fix: if you still want to use urlretrieve, you can override FancyURLopener like this (the code also shows the error):

import urllib.request
from urllib.request import FancyURLopener


class FixFancyURLOpener(FancyURLopener):

    def http_error_default(self, url, fp, errcode, errmsg, headers):
        if errcode == 403:
            raise ValueError("403")
        return super(FixFancyURLOpener, self).http_error_default(
            url, fp, errcode, errmsg, headers
        )

# Monkey Patch
urllib.request.FancyURLopener = FixFancyURLOpener

url = "http://scholar.google.com/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0"
urllib.request.urlretrieve(url, "cite0.bib")

Otherwise, and this is what I recommend, you can use urllib.request.urlopen like so:

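import urllib.request

fp = urllib.request.urlopen('http://scholar.google.com/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0')
# fp.read() returns bytes in Python 3, so the file is opened in binary mode ("wb").
with open("cite0.bib", "wb") as fo:
    fo.write(fp.read())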
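
If you still want the retry behaviour from the question, the urlopen approach can be wrapped in a bounded loop that gives up immediately on errors a retry cannot fix, such as a 403. This is a rough sketch under stated assumptions: the helper name download, the attempts and delay parameters, and the 10-second timeout are all hypothetical choices, not anything urllib prescribes:

```python
import shutil
import time
import urllib.error
import urllib.request


def download(url, filename, attempts=3, delay=1.0):
    """Save url to filename, retrying only failures that may be temporary.

    download, attempts and delay are hypothetical names for this sketch.
    """
    for attempt in range(1, attempts + 1):
        try:
            with urllib.request.urlopen(url, timeout=10) as response, \
                    open(filename, "wb") as out:
                # The response body is bytes, so the file is opened in binary mode.
                shutil.copyfileobj(response, out)
            return filename
        except urllib.error.HTTPError as err:
            # A 4xx status such as 403 will not change on retry, so re-raise it.
            if err.code < 500 or attempt == attempts:
                raise
        except urllib.error.URLError:
            # Network-level failure (DNS, refused connection): possibly temporary.
            if attempt == attempts:
                raise
        time.sleep(delay)
```

HTTPError is caught before URLError because it is a subclass of URLError. Unlike the while True loop in the question, this stops after a fixed number of attempts and lets the 403 surface as an exception.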

Tags:

python

python-3.x

urllib

Ricardo Altamirano
