I am trying to load my saved model from S3 using joblib:
```python
import pandas as pd
import numpy as np
import json
import subprocess
import sqlalchemy
from sklearn.externals import joblib

ENV = 'dev'

def load_d2v(fname, env):
    model_name = fname
    if env == 'dev':
        try:
            # Try to load the model from the local working directory first
            model = joblib.load(model_name)
        except:
            # Fall back to downloading the model file from S3
            s3_base_path = 's3://sd-flikku/datalake/doc2vec_model'
            path = s3_base_path + '/' + model_name
            command = "aws s3 cp {} {}".format(path, model_name).split()
            print('loading...' + model_name)
            subprocess.call(command)
            model = joblib.load(model_name)
    else:
        s3_base_path = 's3://sd-flikku/datalake/doc2vec_model'
        path = s3_base_path + '/' + model_name
        command = "aws s3 cp {} {}".format(path, model_name).split()
        print('loading...' + model_name)
        subprocess.call(command)
        model = joblib.load(model_name)
    return model

model_d2v = load_d2v('model_d2v_version_002', ENV)
```
But I was facing this error:
```
from sklearn.externals import joblib
ImportError: cannot import name 'joblib' from 'sklearn.externals' (C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\externals\__init__.py)
```
Then I tried installing joblib directly and importing it with
import joblib
but it gave me this error:
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in load_d2v_from_s3
  File "/home/ec2-user/.local/lib/python3.7/site-packages/joblib/numpy_pickle.py", line 585, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/joblib/numpy_pickle.py", line 504, in _unpickle
    obj = unpickler.load()
  File "/usr/lib64/python3.7/pickle.py", line 1088, in load
    dispatch[key[0]](self)
  File "/usr/lib64/python3.7/pickle.py", line 1376, in load_global
    klass = self.find_class(module, name)
  File "/usr/lib64/python3.7/pickle.py", line 1426, in find_class
    __import__(module, level=0)
ModuleNotFoundError: No module named 'sklearn.externals.joblib'
```
Can you tell me how to solve this? Thanks in advance
Sklearn Joblib Summary

You can connect joblib to the Dask backend to scale out to a remote cluster for even faster processing times. You can use XGBoost-on-Dask and/or dask-ml for distributed machine learning training on datasets that don't fit into local memory.
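For instance, here is a minimal sketch of routing joblib's parallelism through Dask; the local `Client()`, the toy dataset, and the estimator are all illustrative, and a remote cluster would be reached by passing its scheduler address to `Client` instead:

```python
# Minimal sketch: run scikit-learn's joblib-based parallelism on Dask.
# Client() starts a local cluster; pass a scheduler address for a remote one.
import joblib
from dask.distributed import Client
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

client = Client()  # e.g. Client("tcp://<scheduler-host>:8786") for remote

X, y = make_classification(n_samples=1000, n_features=20)
clf = RandomForestClassifier(n_estimators=200)

with joblib.parallel_backend("dask"):  # dispatch joblib tasks to the cluster
    clf.fit(X, y)
```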
Similarly, if you hit the analogous error for `sklearn.externals.six`, you can use the official six package. First install six using `pip install six`, then import the module directly. There is no need to downgrade scikit-learn.
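A quick sketch of that swap (the old import is shown commented out; it only worked on scikit-learn releases that still vendored six):

```python
# Old, removed vendored import (fails on recent scikit-learn):
# from sklearn.externals import six

# New: use the standalone package installed via `pip install six`
import six

print(six.PY3)  # True when running under Python 3
```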
Joblib is a set of tools to provide lightweight pipelining in Python. In particular:

- transparent disk-caching of functions and lazy re-evaluation (memoize pattern)
- easy, simple parallel computing
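Both features are available from the top-level package; a small sketch (the cache directory name is just an illustration):

```python
from joblib import Memory, Parallel, delayed

memory = Memory("./joblib_cache", verbose=0)  # on-disk cache location

@memory.cache  # transparent disk-caching: results are memoized by argument
def square(x):
    return x * x

# easy, simple parallel computing: fan the calls out over two worker processes
results = Parallel(n_jobs=2)(delayed(square)(i) for i in range(10))
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```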
You should directly use
import joblib
instead of
from sklearn.externals import joblib
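This works on current scikit-learn because the vendored copy was deprecated in 0.21 and removed in 0.23; joblib now has to be installed as its own package (`pip install joblib`). Your second traceback is a different issue, though: the model file was saved with an older scikit-learn, so the pickle stream itself still references `sklearn.externals.joblib`. A common workaround is to alias that module path to the standalone joblib before loading; this is a sketch, assuming the model file sits in the working directory:

```python
import sys
import joblib

# Make the old pickle reference resolve to the standalone joblib module
sys.modules["sklearn.externals.joblib"] = joblib

model = joblib.load("model_d2v_version_002")
```

Once it loads, re-saving the model with `joblib.dump` removes the stale reference, so future loads won't need the alias.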