Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does searching with pip work?

Yes, I'm dead serious with this question. How does searching with pip work?

The documentation of the keyword search refers to a "pip search reference" at https://pip.pypa.io/en/stable/user_guide/#searching-for-packages which is everything but a reference.

I can't conclude from search attempts how searching works. E.g. if I search for "exec" I get a variety of results such as exec-pypeline (0.4.2) - an incredible python package. I even get results with package names that have nothing to do with "exec" as long as the term "exec" is in the description.

But strangely I don't see one of my own packages in the list though one of the packages contains exec in it's name. That alone now would lead us to the conclusion that pip (at least) searches for complete search terms in the package description (which my package doesn't have).

But building on that assumption if I search for other terms that are provided in the package description I don't get my package listed either. And that applies to other packages as well: E.g. if I search for "projects" I don't get flask-macros in the result set though the term "projects" clearly exists in the description of flask-macros. So as this contradicts the assumption above this is clearly not the way how searching works.

And interestingly I can search for "macro" and get "flask-macros" as a result, but if I search for "macr" "flask-macros" is not found.

So how exactly is searching performed by pip? Where can a suitable reference be found for this?

like image 384
Regis May Avatar asked Jul 10 '18 15:07

Regis May


People also ask

Where does pip find packages?

The pip command looks for the package in PyPI, resolves its dependencies, and installs everything in your current Python environment to ensure that requests will work. The pip install <package> command always looks for the latest version of the package and installs it.

How do I check my pip list?

List Installed Packages with Pip. Both pip list and pip freeze will generate a list of installed packages, just with differently formatted results. Keep in mind that pip list will list ALL installed packages (regardless of how they were installed). while pip freeze will list only everything installed by Pip.


1 Answers

pip search looks for substring contained in the distribution name or the distribution summary. I can not see this documented anywhere, and found it by following the command in the source code directly.

The code for the search feature, which dates from Feb 2010, is still using an old xmlrpc_client. There is issue395 to change this, open since 2011, since the XML-RPC API is now considered legacy and should not be used. Somewhat surprisingly, the endpoint was not deprecated in the pypi-legacy to warehouse move, as the legacy routes are still there.

flask-macros did not show up in a search for "project" because this is too common a search term. Only 100 results are returned, this is a hardcoded limit in the elasticsearch view which handles the requests to those PyPI search routes. Note that this was reduced from 1000 fairly recently in PR3827.

Code to do a search with an API client directly:

import xmlrpc.client

client = xmlrpc.client.ServerProxy('https://pypi.org/pypi')
query = 'project'
results = client.search({'name': query, 'summary': query}, 'or')
print(len(results), 'results returned')
for result in sorted(results, key=lambda data: data['name'].lower()):
    print(result)

edit: The 100 result limit is now documented here.

like image 52
wim Avatar answered Oct 25 '22 10:10

wim