I'm running conda environments on a compute cluster where the total number of files per "project" is restricted (200k files max). I've only created a couple of conda environments (Anaconda for Python 2.7; ~200 Python and R packages installed in each environment; high package overlap between environments) and have already hit that file limit. Even when I run conda clean -a, only a small fraction of the files are removed. Some packages in my conda environments (e.g., boost) contain >10k files each, and clean does not reduce this.
Is there any way to greatly reduce the number of files stored as part of a conda environment?
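For reference, a quick way to count the files in one environment (the environment name myenv is hypothetical; substitute your own):

    # count regular files in a single conda environment
    find "$(conda info --base)/envs/myenv" -type f | wc -l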
One reason is that conda environments are completely isolated workspaces, each with its own copy of Python, so the more environments you have, the more space Anaconda needs. The other reason is that Anaconda keeps a cache of the downloaded package files, tarballs, etc.
According to the documentation, you can use conda clean --packages to remove unused packages in pkgs (this moves them to pkgs/.trash, from which you can then safely delete them).
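A minimal sketch of that cleanup (the ~/anaconda3 install path is an assumption; adjust it to your setup):

    # remove unused cached packages from pkgs/
    conda clean --packages
    # remove cached tarballs as well
    conda clean --tarballs
    # or clear index cache, lock files, tarballs, and unused packages in one go
    conda clean --all
    # anything moved to pkgs/.trash can then be deleted permanently
    rm -rf ~/anaconda3/pkgs/.trash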
Removing an environment: you can use conda env remove to delete an environment entirely. As with conda create, you specify the environment you wish to remove by name using --name.
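For example, with a hypothetical environment named old_env:

    # delete the environment and all files hard-linked into it
    conda env remove --name old_env
    # verify that it is gone
    conda env list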
Anaconda uses hard links to reduce the consumed disk space, but when the limit is imposed on the number of files rather than on bytes, each hard link counts as a separate file.
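You can check how much of an environment consists of hard-linked files (again, the install path and environment name are assumptions):

    # count files with more than one hard link inside one environment
    find ~/anaconda3/envs/myenv -type f -links +1 | wc -l
    # compare with the total file count
    find ~/anaconda3/envs/myenv -type f | wc -l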
As discussed in the comments, using Miniconda instead of Anaconda, and installing only the packages you actually need, might help.
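A sketch of that approach, with hypothetical package names standing in for whatever you actually need:

    # after installing Miniconda, create an environment with only the essentials
    conda create --name analysis python=2.7 numpy pandas r-base
    # add further packages one at a time, only as needed
    conda install --name analysis scipy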
If this isn't enough, I'd recommend merging several of your environments into one, so that overlapping packages need fewer hard links. Of course, that is the opposite of what environments are for, but such is the nature of workarounds.
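One way to merge two environments is to export their specifications, combine the dependency lists by hand, and rebuild a single environment; env_a, env_b, and merged are hypothetical names:

    # export both environments to YAML
    conda env export --name env_a > env_a.yml
    conda env export --name env_b > env_b.yml
    # manually merge the dependencies sections into merged.yml, then:
    conda env create --name merged --file merged.yml
    # once the merged environment is verified, remove the originals
    conda env remove --name env_a
    conda env remove --name env_b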