I have searched for a while but couldn't find a satisfactory answer: How does conda (http://conda.pydata.org) work internally? Any details are welcome. Furthermore, as it is Python agnostic and apparently works so well and fluently, why is it not used as a general-purpose package manager like apt or yum? What are the restrictions of using only conda as a package manager? Would it work? Or the other way round, why are e.g. apt and yum not able to provide the functionality conda provides? Is conda "better" than those package managers, or just different? Thanks for any hints!

How does conda work internally?

1 Answer

I explain a lot of this in my SciPy 2014 talk. Let me give a little outline here.

First off, a conda package is really simple. It is just a tarball of the files that are to be installed, along with some metadata in an info directory. For example, the conda package for python is a tarball of the following files:

info/
    files
    index.json
    ...
bin/
    python
    ...
lib/
    libpython.so
    python2.7/
        ...
    ...
...

You can see exactly what it looks like by looking at the extracted packages in the Anaconda pkgs directory. The full spec is at https://docs.conda.io/projects/conda-build/en/latest/source/package-spec.html.
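To make the layout concrete, here is a minimal sketch that builds a toy tarball shaped like the tree above and reads its metadata back. The package name `toy`, its contents, and the helper names `build_toy_package` and `read_package_metadata` are made up for illustration; only the `info/index.json` and `info/files` paths come from the layout shown above, and the `.tar.bz2` compression is an assumption about the archive format.

```python
import io
import json
import tarfile


def build_toy_package() -> bytes:
    """Build a tiny in-memory tarball laid out like a conda package (toy data)."""
    members = {
        # Metadata lives under info/; these values are invented for the demo.
        "info/index.json": json.dumps(
            {"name": "toy", "version": "1.0", "build": "0", "depends": []}
        ),
        "info/files": "bin/toy\n",
        # The payload: files that would end up in the environment.
        "bin/toy": "#!/bin/sh\necho toy\n",
    }
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        for name, text in members.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_package_metadata(package_bytes: bytes):
    """Return (parsed index.json, list of payload files) from the tarball."""
    with tarfile.open(fileobj=io.BytesIO(package_bytes), mode="r:bz2") as tar:
        index = json.load(tar.extractfile("info/index.json"))
        files = tar.extractfile("info/files").read().decode("utf-8").splitlines()
    return index, files


index, files = read_package_metadata(build_toy_package())
print(index["name"], index["version"], files)
```

The same `read_package_metadata` idea should work on a real package file if you open it from disk instead of from memory, assuming it is a bzip2-compressed tarball.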

When conda installs this, it extracts the tarball to the pkgs directory and hard links the files into the installation environment. Finally, files that contain hard-coded installation paths (usually in shebang lines) have those paths replaced with the actual environment path.

That's basically it. There is some more stuff that happens in terms of dependency resolution, but once it knows what packages it's going to install, that's how it does it.

The process of building a package is a little more complicated. @mattexx's answer and the document it links to describe a bit of the canonical way of building a package using conda build.
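The install step described above can be sketched roughly as follows. This is not conda's actual code: `link_package`, the `PLACEHOLDER` prefix, and the `files_with_prefix` argument are hypothetical names, and copying instead of linking when a file needs rewriting (or when hard links are unavailable) is a choice made for this sketch.

```python
import os
import shutil
import tempfile

PLACEHOLDER = "/opt/placeholder_prefix"  # hypothetical build-time prefix


def link_package(pkg_dir, env_prefix, files, files_with_prefix):
    """Hard link files from pkg_dir into env_prefix, rewriting embedded prefixes."""
    for rel in files:
        src = os.path.join(pkg_dir, rel)
        dst = os.path.join(env_prefix, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        if rel in files_with_prefix:
            # A rewritten file must be a real copy: editing through a hard
            # link would also change the shared file in the pkgs directory.
            with open(src, encoding="utf-8") as fh:
                text = fh.read()
            with open(dst, "w", encoding="utf-8") as fh:
                fh.write(text.replace(PLACEHOLDER, env_prefix))
            shutil.copymode(src, dst)
        else:
            try:
                os.link(src, dst)  # no extra disk space used
            except OSError:
                shutil.copy2(src, dst)  # fallback chosen for this sketch


# Demo with a toy "extracted package" in a temporary pkgs directory.
pkgs = tempfile.mkdtemp(prefix="pkgs-")
env = tempfile.mkdtemp(prefix="env-")
os.makedirs(os.path.join(pkgs, "bin"))
os.makedirs(os.path.join(pkgs, "lib"))
with open(os.path.join(pkgs, "bin", "tool"), "w", encoding="utf-8") as fh:
    fh.write(f"#!{PLACEHOLDER}/bin/python\nprint('hi')\n")
with open(os.path.join(pkgs, "lib", "data.txt"), "w", encoding="utf-8") as fh:
    fh.write("payload\n")

link_package(pkgs, env, ["bin/tool", "lib/data.txt"], {"bin/tool"})
with open(os.path.join(env, "bin", "tool"), encoding="utf-8") as fh:
    shebang = fh.readline().strip()
print(shebang)  # the shebang now points into the environment directory
```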

To answer your other questions:

Furthermore, as it is Python agnostic and apparently works so well and fluently, why is it not used as a general-purpose package manager like apt or yum?

You certainly can. The only thing limiting this is the set of packages that have been built for conda. On Windows, this is a very nice option, as there aren't any system package managers like there are on Linux.

What are the restrictions of using only conda as a package manager? Would it work?

It would work, assuming you have conda packages for everything you are interested in. The main restriction is that conda only wants to install things into the conda environment itself, so things that require specific installation locations on the system might not be well suited to conda (although it's still doable, if you set that location as your environment path). Or for instance, conda might not be a suitable replacement for "project level" package managers like bower.

Also, conda probably shouldn't be used to manage system level libraries (libraries that must be installed in the / prefix), like kernel extensions or the kernel itself, unless you were to build out a distribution that uses conda as a package manager explicitly.

The main thing I will say about these things is that conda packages are generally made to be relocatable, meaning the installation prefix of the package does not matter. This is why hard-coded paths are changed as part of the install process, for instance. It also means that dynamic libraries built with conda build will have their RPATHs (on Linux) and install names (on OS X) changed automatically to use relative paths instead of absolute ones.
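As a rough illustration of what "relative instead of absolute" means for an RPATH on Linux, the sketch below computes a path relative to the library's own directory and prefixes it with `$ORIGIN`, the token the Linux dynamic loader expands to the directory containing the loaded object. The helper name `relative_rpath` and the example paths are hypothetical; this only computes the string and does not patch any binary, which is not necessarily how conda build does it.

```python
import posixpath


def relative_rpath(library_path: str, lib_dir: str) -> str:
    """Return an $ORIGIN-relative RPATH entry pointing from library_path to lib_dir."""
    rel = posixpath.relpath(lib_dir, start=posixpath.dirname(library_path))
    return "$ORIGIN" if rel == "." else posixpath.join("$ORIGIN", rel)


# An extension module two directories below lib/ finds lib/ two levels up,
# wherever the whole tree is installed.
print(relative_rpath("/build/prefix/lib/python2.7/lib-dynload/_ssl.so",
                     "/build/prefix/lib"))  # → $ORIGIN/../..
```

Because the entry no longer mentions `/build/prefix`, the same library keeps working after the tree is moved to any other prefix.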

Or the other way round, why are e.g. apt and yum not able to provide the functionality conda provides? Is conda "better" than those package managers, or just different?

In some ways it's better, and in some ways it's not. Your system package manager knows your system, and there are packages in there that are not going to be in conda (and some, like the kernel, that probably shouldn't be in conda).

The main advantage of conda is its notion of environments. Since packages are made to be relocatable, you can install the same package in multiple places, and effectively have completely independent installs of everything, basically for free.
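The "basically for free" part follows from the hard links mentioned earlier: each environment gets its own directory entry, but the data exists on disk only once. Here is a small sketch of that effect, using a made-up package `toy-1.0-0` and made-up environment names, and assuming the filesystem supports hard links:

```python
import os
import tempfile

root = tempfile.mkdtemp()

# One cached copy of the file, as it would sit in the pkgs directory.
cached = os.path.join(root, "pkgs", "toy-1.0-0", "lib", "libtoy.so")
os.makedirs(os.path.dirname(cached))
with open(cached, "wb") as fh:
    fh.write(b"\0" * 1024)

# Two independent environments, each with its own name for the same file.
env_copies = []
for env_name in ("analysis", "testing"):
    target = os.path.join(root, "envs", env_name, "lib", "libtoy.so")
    os.makedirs(os.path.dirname(target))
    os.link(cached, target)
    env_copies.append(target)

link_count = os.stat(cached).st_nlink
shared = all(os.path.samefile(cached, path) for path in env_copies)
print(link_count, shared)  # three names, one file on disk
```

Removing one environment only removes its directory entries; the other environment and the cached copy are unaffected.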

Does it use some kind of containerization

No, the only "containerization" is having separate install directories and making packages relocatable.

or static linking of all the dependencies,

The dependency linking is completely up to the package itself. Some packages statically link their dependencies, some don't. The dynamically linked libraries have their load paths changed as I described above to be relocatable.

why is it so "cross platform"?

"Cross platform" in this case means "cross operating system". Although the same binary package can't work across OS X, Linux, and Windows, the point is that conda itself works identically on all three, so if you have the same packages built for all three platforms, you can manage them all the same way regardless of which one you are on.
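One way to picture "the same packages built for all three platforms" is that each build is published under a per-platform label, and the client picks the label matching the machine it runs on. The sketch below maps Python's `platform` values to labels such as `linux-64`, `osx-64`, and `win-64`. The function name `platform_subdir` and the lookup tables are a simplified illustration covering only a few x86 cases, not conda's own detection logic.

```python
import platform

# Simplified lookup tables; real platform detection handles more cases.
_OS_NAMES = {"Linux": "linux", "Darwin": "osx", "Windows": "win"}
_ARCH_NAMES = {"x86_64": "64", "AMD64": "64", "i386": "32", "i686": "32", "x86": "32"}


def platform_subdir(system=None, machine=None):
    """Return a label like 'linux-64' for the given (or current) platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        return f"{_OS_NAMES[system]}-{_ARCH_NAMES[machine]}"
    except KeyError:
        raise ValueError(f"unsupported platform in this sketch: {system} {machine}") from None


for system, machine in [("Linux", "x86_64"), ("Darwin", "x86_64"), ("Windows", "AMD64")]:
    print(system, "->", platform_subdir(system, machine))
```

The package contents differ per label, but the commands used to install and manage them stay the same.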

answered Sep 28 '22 04:09

asmeurer

Tags:

python

package-managers

conda

environment

SebastianNeubauer
