I want to download the file in the following url using python. I tried with the following code but it seems like not working. I think the error is in the file format. I would be glad if you can suggest the modifications to the code or a new code that I can use for this purpose
Link to the website
https://www.gov.uk/government/statistics/transport-use-during-the-coronavirus-covid-19-pandemic
URL required to be downloaded
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/959864/COVID-19-transport-use-statistics.ods
My Code
from urllib import request
response = request.urlopen("https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/959864/COVID-19-transport-use-statistics.ods")
csv = response.read()
csvstr = str(csv).strip("b'")
lines = csvstr.split("\\n")
f = open("historical.csv", "w")
for line in lines:
   f.write(line + "\n")
f.close()
Here basically I only want to download the file. I have heard that Beautifulsoup can be used for that but I don't have much experience on this. Any code that would serve my purpose is highly appreciated
Thanks
To download the file:
In [1]: import requests
In [2]: url = 'https://assets.publishing.service.gov.uk/government/uploads/syste
   ...: m/uploads/attachment_data/file/959864/COVID-19-transport-use-statistics.
   ...: ods'
In [3]: with open('COVID-19-transport-use-statistics.ods', 'wb') as out_file:
   ...:     content = requests.get(url, stream=True).content
   ...:     out_file.write(content)
And then you can use pandas-ods-reader to read the file by running:
pip install pandas-ods-reader
Then:
In [4]: from pandas_ods_reader import read_ods
In [5]: df = read_ods('COVID-19-transport-use-statistics.ods', 1)
In [6]: df
Out[6]: 
                   Department for Transport statistics  ...   unnamed.9
0    https://www.gov.uk/government/statistics/trans...  ...        None
1                                                 None  ...        None
2    Use of transport modes: Great Britain, since 1...  ...        None
3    Figures are percentages of an equivalent day o...  ...        None
4                                                 None  ...  Percentage
..                                                 ...  ...         ...
390                  Transport for London Tube and Bus  ...        None
391                               Buses (excl. London)  ...        None
392                                           Cycling   ...        None
393                                  Any other queries  ...        None
394                                    Media enquiries  ...        None
And you can save it as a csv if that is what you want using df.to_csv('my_data.csv', index=False)
I see that you are just trying to download the file that is .ods format and I think saving it in .csv wont convert it into a csv file.
Following code would help you download the file. I have used requests library which is a better option in place of urllib.
import requests
file_url = "https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/959864/COVID-19-transport-use-statistics.ods"
file_data = requests.get(file_url).content
# create the file in write binary mode, because the data we get from net is in binary
with open("historical.ods", "wb") as file:
    file.write(file_data)
Output file can be viewed in MS Excel.

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With