
Number of installations statistics for PyPI packages?

I've got a couple of packages on the Python Package Index (PyPI) now. Is there any way to get statistics on how many times they have been downloaded (either manually or via easy_install or pip)?

Or, alternatively, how many views the main package page has received?

robintw asked Apr 29 '12 22:04



2 Answers

There are at least two packages that help with this: pypstats and vanity. Vanity is very easy to use from the command line:

vanity numpy  

and you'll get a printout to your console.

pastephens answered Sep 21 '22 20:09


Download statistics are not available on the pypi.python.org website, and the vanity package no longer works either.

Today, download statistics are available only through this BigQuery dataset: https://bigquery.cloud.google.com/dataset/the-psf:pypi

Example query for the https://pypi.python.org/pypi/dvc package:

SELECT
  details.system.name,
  COUNT(*) as download_count,
FROM
  TABLE_DATE_RANGE(
    [the-psf:pypi.downloads],
    DATE_ADD(CURRENT_TIMESTAMP(), -31, "day"),
    DATE_ADD(CURRENT_TIMESTAMP(), -1, "day")
  )
WHERE
  file.project = 'dvc'
GROUP BY details.system.name

Note that some downloads are generated by monitoring tools and should not be counted as user downloads. For example, you should exclude rows with a null system name from the output:

Row  details_system_name  download_count
1    Darwin               1111
2    null                 10000
3    Windows              222
4    Linux                3333
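
As a rough illustration of excluding the null rows, here is a minimal Python sketch. It assumes the query result has already been exported as (system name, count) pairs, with None standing in for null. The names SAMPLE_ROWS and user_downloads are hypothetical and not part of any BigQuery API.

```python
from typing import Iterable, Optional, Tuple

# Rows copied from the sample output above: (details_system_name, download_count).
# None stands in for the null system name (assumed export format).
SAMPLE_ROWS = [
    ("Darwin", 1111),
    (None, 10000),
    ("Windows", 222),
    ("Linux", 3333),
]


def user_downloads(rows: Iterable[Tuple[Optional[str], int]]) -> int:
    """Sum download counts, skipping rows whose system name is null."""
    return sum(count for name, count in rows if name is not None)


if __name__ == "__main__":
    print(user_downloads(SAMPLE_ROWS))
```

The same filtering could presumably be done in the query itself by adding a condition on details.system.name to the WHERE clause, which would avoid post-processing.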
Dmitry Petrov answered Sep 20 '22 20:09