How can I skip files that do not exist in the repository using Python?

Tags: python, xml

I want to download XML files from an issue tracking system one by one. The script raises an error when a file does not exist in the repository. I have included my Python script to make the problem clearer.

My code:

import urllib.request
for i in range(0,1000):
    issue_id1='DERBY-'+str(i)
    url ="https://issues.apache.org/jira/si/jira.issueviews:issue-xml/"+issue_id1+'/'+issue_id1+'.xml'
    s=urllib.request.urlopen(url)
    contents = s.read()
    file = open(issue_id1+'.xml', 'wb')
    file.write(contents)

file.close()

Stack trace:

Traceback (most recent call last):
  File "/PhP/Learning/xmldownlaod.py", line 10, in <module>
    s=urllib.request.urlopen(url)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 469, in open
    response = meth(req, response)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 579, in http_response
    'http', request, response, code, msg, hdrs)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 507, in error
    return self._call_chain(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 441, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 587, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

asked Mar 29 '17 03:03

Reja

1 Answer

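Python uses try/except blocks for error handling. A 404 response makes urlopen raise urllib.error.HTTPError, which is a subclass of URLError, so catching URLError lets you skip the missing file and continue with the next issue.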

import urllib.request
from urllib.error import URLError  # the docs say this is the base error you need to catch

for i in range(0, 1000):
    issue_id1 = 'DERBY-' + str(i)
    url = "https://issues.apache.org/jira/si/jira.issueviews:issue-xml/" + issue_id1 + '/' + issue_id1 + '.xml'
    try:
        s = urllib.request.urlopen(url)
        contents = s.read()
    except URLError:
        print('an error occurred while fetching: "{}"'.format(url))
        continue  # skip this url and proceed to the next
    # the with block closes each file as soon as it has been written
    with open(issue_id1 + '.xml', 'wb') as file:
        file.write(contents)
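
If you only want to skip issues that really do not exist, a minimal sketch along these lines may help. It assumes you want to treat HTTP 404 as "skip" and let every other failure surface instead of being silently ignored. The helper names fetch_issue_xml and download_issues, the target_dir parameter, and the opener parameter (there only so the network call can be swapped out) are my own and not part of urllib.

```python
import os
import urllib.request
from urllib.error import HTTPError

# URL prefix taken from the question
BASE_URL = "https://issues.apache.org/jira/si/jira.issueviews:issue-xml/"


def fetch_issue_xml(issue_id, opener=urllib.request.urlopen):
    """Return the XML bytes for issue_id, or None if the server answers 404."""
    url = "{0}{1}/{1}.xml".format(BASE_URL, issue_id)
    try:
        with opener(url) as response:
            return response.read()
    except HTTPError as err:
        if err.code == 404:
            return None  # the issue does not exist, so the caller can skip it
        raise  # any other HTTP status is unexpected, so let it surface


def download_issues(issue_ids, target_dir=".", opener=urllib.request.urlopen):
    """Save each existing issue as <id>.xml and return the ids that were skipped."""
    skipped = []
    for issue_id in issue_ids:
        contents = fetch_issue_xml(issue_id, opener)
        if contents is None:
            skipped.append(issue_id)
            continue
        with open(os.path.join(target_dir, issue_id + ".xml"), "wb") as out:
            out.write(contents)
    return skipped
```

Calling download_issues('DERBY-{}'.format(i) for i in range(1000)) would then mirror the loop from the question and hand back the list of ids that returned 404.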

answered Sep 18 '22 18:09

Peter Gibson
