
Simply use Python Anaconda without an internet connection

I would like to deploy a python environment on production servers that have no access to the internet.

I discovered the Anaconda Python distribution and installed it to give it a try. The installation directory is 1.6 GB, and I can see in the pkgs directory that a lot of libraries are there.

However, when I try to create an environment, conda does not look in the local directories...

conda create --offline --use-local --dry-run  --name pandas_etl python
Using Anaconda Cloud api site https://api.anaconda.org
Fetching package metadata:
Solving package specifications:
Error:  Package missing in current linux-64 channels:
  - python

So, what is the point of bundling all those libraries if conda needs to fetch them from online repositories? Maybe I am missing something?

I am looking for a kind of "lots of batteries included python" for convenient deployment.

Note: I use a Linux system and installed the regular anaconda, not the miniconda

asked May 23 '16 13:05 by stockersky




2 Answers

Well, after playing around with Pandas while reading Fabio Nelli's book 'Python Data Analytics', I realized how awesome a library Pandas is. So I've been working with Anaconda to make it work in my environment.

1- Download the Anaconda installer and install it (I guess Miniconda will be enough)

2- Make a local channel by mirroring (part of) the Anaconda repository

Do not try to download individual packages on your workstation to push them to your offline server. Indeed, dependencies will not be satisfied. Packages need to be contained in a channel and indexed in metadata files (repodata.json and repodata.json.bz2) to be properly 'glued' together.

I used wget to mirror part of the Anaconda repository: https://repo.continuum.io/pkgs/. To avoid downloading the whole repo, I filtered out packages with something like this:

wget -r --no-parent --regex-type pcre --reject-regex '(.*py2[67].*)|(.*py[34].*)' https://repo.continuum.io/pkgs/free/linux-64/

Be careful not to filter for something like "only py35" packages. Many packages in the repo have no Python version string in their name, and you would miss them as dependencies.

I guess you could filter more precisely: I fetched about 6 GB of packages!
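Before starting a multi-gigabyte download, it can help to try a reject pattern against a few file names. The sketch below is only an illustration under assumptions: the helper name `would_download` and the sample file names are hypothetical, and it uses `grep -E` instead of wget's own matcher, which should behave the same for a pattern this simple. Note that a pattern containing `py[34]` also matches `py35`; the `py3[34]` variant used here is a guess at a filter that keeps py35 builds and unversioned packages.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the file names a reject pattern would let through.
# Reads one file name per line on stdin; $1 is the reject pattern (ERE syntax).
would_download() {
    grep -Ev -- "$1" || true   # grep exits 1 when nothing is kept; not an error here
}

# Reject Python 2.6/2.7 and 3.3/3.4 builds; keep py35 and unversioned packages.
reject='py2[67]|py3[34]'

# Sample names are made up for illustration.
printf '%s\n' \
    'numpy-1.11.0-py27_0.tar.bz2' \
    'numpy-1.11.0-py34_0.tar.bz2' \
    'numpy-1.11.0-py35_0.tar.bz2' \
    'openssl-1.0.2h-0.tar.bz2' |
    would_download "$reject"
```

With these samples, only the py35 build and the unversioned openssl package are printed.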

Do NOT build a custom channel (see "anaconda custom channels") from the part of the repository you just downloaded! I tried this at first and got this exception: "RecursionError: maximum recursion depth exceeded while calling a Python object". This is a known problem: https://github.com/conda/conda/issues/2371. As the maintainers discuss there, the metadata maintained in repodata.json and repodata.json.bz2 does not reflect the metadata inside the individual packages. They chose to fix issues by editing only the repository metadata rather than each package's metadata. So, if you rebuild the channel metadata from the packages, you lose those fixes.

So: do not rebuild the channel metadata, just keep the repository metadata (the repodata.json and repodata.json.bz2 files from the official Anaconda repository). Even if the whole repo is not in your new channel, it will work (at least, if you did not filter too much while mirroring ;-) )
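A quick sanity check after mirroring is to confirm that the upstream index files actually arrived alongside the packages. This is a minimal sketch, assuming the mirror keeps the upstream layout with a platform subdirectory such as linux-64; the function name `check_channel` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a mirrored channel still carries the
# index files downloaded from upstream for a given platform.
# $1 = channel root (the directory that contains linux-64/), $2 = platform.
check_channel() {
    local root="$1" platform="${2:-linux-64}" f missing=0
    for f in repodata.json repodata.json.bz2; do
        # -s: the file exists and is not empty
        if [ ! -s "$root/$platform/$f" ]; then
            echo "missing or empty: $root/$platform/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_channel /Path_to_your_channel/repo.continuum.io/pkgs/free && echo "index files present"` prints the message only when both files exist and are non-empty.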

3- Test and use your new channel

conda search -c file://Path_to_your_channel/repo.continuum.io/pkgs/free/ --override-channels

NOTE: Do not include your platform architecture in the path. Example: your channel tree is probably /Path_to_your_channel/repo.continuum.io/pkgs/free/linux-64. Just omit the architecture (linux-64 in my case); conda will work it out.
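The path rule in the note can be sketched as a small helper that builds the channel URL from a mirror directory. The function name `channel_url` is hypothetical, and the list of platform directory names it strips is an assumption based on common conda platform names. With an absolute path it yields three slashes after `file:`, the usual form for a local file URL.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a local mirror directory into a channel URL,
# dropping a trailing platform directory (linux-64, osx-64, ...) if present.
channel_url() {
    local path="${1%/}"                 # strip one trailing slash
    case "${path##*/}" in               # look at the last path component
        linux-*|osx-*|win-*|noarch) path="${path%/*}" ;;
    esac
    printf 'file://%s/\n' "$path"
}

channel_url /Path_to_your_channel/repo.continuum.io/pkgs/free/linux-64
# prints: file:///Path_to_your_channel/repo.continuum.io/pkgs/free/
```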

Update :

conda update  -c file://resto/anaconda_repo/repo.continuum.io/pkgs/free/ --override-channels --all

And so on... I guess you can use your user's conda configuration file (.condarc) to force the use of this local channel.
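As a rough sketch of that configuration idea, a conda configuration file whose `channels` list contains only the local mirror should keep conda away from the default online channels. I have not verified this against the conda version used above, so treat it as an assumption; the helper name `condarc_for_channel` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a minimal conda configuration that lists only
# the given channel URL. Nothing is written to disk here.
condarc_for_channel() {
    printf 'channels:\n  - %s\n' "$1"
}

condarc_for_channel file:///Path_to_your_channel/repo.continuum.io/pkgs/free/
```

Redirecting the output to `~/.condarc` would overwrite any existing configuration, so check for one first.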

Hope it helps.

Guillaume

answered Sep 22 '22 14:09 by stockersky


Another option is to use conda-pack. From the documentation:

On the source machine

  • Pack environment my_env into my_env.tar.gz
    $ conda pack -n my_env

  • Pack environment my_env into out_name.tar.gz
    $ conda pack -n my_env -o out_name.tar.gz

  • Pack environment located at an explicit path into my_env.tar.gz
    $ conda pack -p /explicit/path/to/my_env

On the target machine

  • Unpack environment into directory my_env
    $ mkdir -p my_env
    $ tar -xzf my_env.tar.gz -C my_env

  • Use python without activating or fixing the prefixes.
    Most python libraries will work fine, but things that require prefix cleanups will fail.
    $ ./my_env/bin/python

  • Activate the environment. This adds my_env/bin to your path
    $ source my_env/bin/activate

  • Run python from in the environment
    (my_env)$ python

  • Cleanup prefixes from in the active environment.
    Note that this command can also be run without activating the environment
    as long as some version of python is already installed on the machine.
    (my_env)$ conda-unpack

  • At this point the environment is exactly as if you installed it here
    using conda directly. All scripts should work fine.
    (my_env)$ ipython --version

  • Deactivate the environment to remove it from your path
    (my_env)$ source my_env/bin/deactivate
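For repeated deployments, the target-machine steps above can be wrapped in one function. This is a minimal sketch rather than anything from the conda-pack documentation: the name `unpack_env` is hypothetical, and it assumes the archive contains a `bin/conda-unpack` script, running it without activation as the note above allows.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unpack a conda-pack archive and fix its prefixes.
# $1 = archive (e.g. my_env.tar.gz), $2 = destination directory.
unpack_env() {
    local archive="$1" dest="$2"
    mkdir -p "$dest" || return
    tar -xzf "$archive" -C "$dest" || return
    # Run the bundled conda-unpack script if the archive provided one.
    if [ -x "$dest/bin/conda-unpack" ]; then
        "$dest/bin/conda-unpack"
    else
        echo "no conda-unpack in $dest/bin; prefixes left as-is" >&2
    fi
}
```

A call such as `unpack_env my_env.tar.gz my_env` leaves the environment ready for `source my_env/bin/activate`.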

answered Sep 20 '22 14:09 by skibee