I need to run code that contains these lines:
from sklearn.datasets import fetch_mldata
mnist = fetch_mldata('MNIST original')
There seems to be a problem with executing it.
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
Since the code tries to download something and my internet connection works well, I assume that the server it wants to access is down.
How can I set it up manually?
`fetch_mldata` will by default check `~/scikit_learn_data/mldata` to see whether the dataset has already been downloaded.
According to the source code:
# if the file does not exist, download it
if not exists(filename):
    urlname = MLDATA_BASE_URL % quote(dataname)
So in your case, it will check the location
~/scikit_learn_data/mldata/mnist-original.mat
and if not found, it will download from
http://mldata.org/repository/data/download/matlab/mnist-original.mat
which currently is down as you suspected.
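To see whether the file is already in place, you can build that path yourself and test it. This is only a minimal sketch: `expected_mldata_path` is a hypothetical helper (not part of scikit-learn), its name mangling just mirrors the 'MNIST original' to `mnist-original.mat` mapping shown above, and the default folder assumes you have not changed scikit-learn's data home.

```python
from pathlib import Path


def expected_mldata_path(dataname, data_home=None):
    """Return the path where the cached .mat file is expected to live.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    if data_home is None:
        # Assumption: the default data home, ~/scikit_learn_data
        data_home = Path.home() / "scikit_learn_data"
    # Assumption: lowercase and hyphenate, as in 'MNIST original' -> 'mnist-original'
    filename = dataname.lower().replace(" ", "-") + ".mat"
    return Path(data_home) / "mldata" / filename


path = expected_mldata_path("MNIST original")
print(path, "exists" if path.exists() else "is missing")
```

If it reports the file as missing, `fetch_mldata` will try the download and hit the same timeout.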
So what you can do is download the dataset from any other location like this:
https://github.com/amplab/datascience-sp14/blob/master/lab7/mldata/mnist-original.mat
and place it in the folder above, keeping the file name mnist-original.mat.
After that, when you run `fetch_mldata()`, it should pick up the downloaded dataset without connecting to mldata.org.
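If you would rather script the manual step, something like the following could copy the downloaded file into place. It is a sketch under assumptions: `install_mldata_file` is a hypothetical helper name, the default folder is assumed to be `~/scikit_learn_data`, and the file must already be saved somewhere on your disk under its original name.

```python
import shutil
from pathlib import Path


def install_mldata_file(downloaded_file, data_home=None):
    """Copy a manually downloaded .mat file into the mldata cache folder.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    if data_home is None:
        # Assumption: the default data home, ~/scikit_learn_data
        data_home = Path.home() / "scikit_learn_data"
    target_dir = Path(data_home) / "mldata"
    # Create the folder if scikit-learn has not created it yet
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(downloaded_file).name
    shutil.copyfile(downloaded_file, target)
    return target


# Example call (the source path is a placeholder for wherever you saved the file):
# install_mldata_file(Path.home() / "Downloads" / "mnist-original.mat")
```

The function returns the destination path, so you can print it to confirm where the file ended up.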
Here `~` refers to the user's home folder. You can use the following code to find where scikit-learn's data folder is located on your system:
from sklearn.datasets import get_data_home
print(get_data_home())