How to add package data recursively in Python setup.py?

I have a new library that has to include a lot of subfolders of small data files, and I'm trying to add them as package data. Imagine my library is laid out like so:

    library
        - foo.py
        - bar.py
        data
            subfolderA
                subfolderA1
                subfolderA2
            subfolderB
                subfolderB1
                ...

I want to add all of the data in all of the subfolders through setup.py, but it seems I have to go into every single subfolder (there are 100 or so) and add an __init__.py file by hand. Furthermore, will setup.py find these files recursively, or do I need to list every subfolder in setup.py, like this:

    package_data={
        'mypackage.data.folderA': ['*'],
        'mypackage.data.folderA.subfolderA1': ['*'],
        'mypackage.data.folderA.subfolderA2': ['*']
    },

I can generate this with a script, but that seems like a super pain. How can I achieve this in setup.py?

PS, the hierarchy of these folders is important because this is a database of material files and we want the file tree to be preserved when we present them in a GUI to the user, so it would be to our advantage to keep this file structure intact.

Dashing Adam Hughes asked Dec 27 '14 04:12



2 Answers

The problem with the glob answer is that it is not fully recursive: its pattern only matches a fixed folder depth. The problem with the copy_tree answer is that the copied files are left behind on an uninstall.

The proper solution is a recursive one which will let you set the package_data parameter in the setup call.

I've written this small function to do this:

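    import os
    from setuptools import setup

    def package_files(directory):
        # Walk the whole tree and collect the path of every file under `directory`.
        paths = []
        for (path, directories, filenames) in os.walk(directory):
            for filename in filenames:
                # package_data paths are resolved relative to the package directory,
                # hence the '..' prefix on a path given relative to setup.py.
                paths.append(os.path.join('..', path, filename))
        return paths

    extra_files = package_files('path_to/extra_files_dir')

    setup(
        ...
        packages=['package_name'],
        package_data={'': extra_files},
        ....
    )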

When you run pip uninstall package_name, you'll see the additional files listed along with the rest of the package, which shows that they are tracked as part of it.
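
A possible variation on the function above, offered as a sketch rather than a drop-in replacement: because package_data paths are resolved relative to the package directory, they can be computed relative to that directory directly, which avoids the '..' prefix. The names package_files_relative, mypackage and data are hypothetical, and the sketch assumes the data folder lives inside the package directory.

```python
import os


def package_files_relative(package_dir, data_subdir):
    """Return every file under package_dir/data_subdir, relative to package_dir.

    Hypothetical helper: assumes the data folder sits inside the package.
    """
    paths = []
    root = os.path.join(package_dir, data_subdir)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            rel_path = os.path.relpath(full_path, package_dir)
            # Use forward slashes so the result looks the same on every platform.
            paths.append(rel_path.replace(os.sep, '/'))
    return sorted(paths)
```

It could then be passed as package_data={'mypackage': package_files_relative('mypackage', 'data')}, keyed by the package name instead of ''.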

Sandy Chapman answered Sep 29 '22 07:09


  1. Use Setuptools instead of distutils.
  2. Use data files instead of package data. These do not require __init__.py.
  3. Generate the lists of files and directories using standard Python code, instead of writing it literally:

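    import glob

    data_files = []
    # Each pattern level matches exactly one folder level, e.g. data/subfolderA/subfolderA1/
    directories = glob.glob('data/subfolder?/subfolder??/')
    for directory in directories:
        files = glob.glob(directory + '*')
        data_files.append((directory, files))
    # then pass data_files to setup()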
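
If the folder depth varies, the same kind of data_files list can be built with os.walk instead of a fixed-depth glob pattern. This is a minimal sketch and not part of the original answer; collect_data_files is a hypothetical helper name, and it assumes the data tree sits next to setup.py.

```python
import os


def collect_data_files(root):
    """Build a data_files list of (directory, [files]) pairs.

    Hypothetical helper: every directory under `root` that directly
    contains at least one file gets an entry, at any depth.
    """
    data_files = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if filenames:
            files = sorted(os.path.join(dirpath, name) for name in filenames)
            data_files.append((dirpath, files))
    return sorted(data_files)
```

The result of collect_data_files('data') could then be passed to setup() as data_files. Directories that hold only subfolders are skipped, since they have no files of their own to install.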

Kevin answered Sep 29 '22 07:09