
How to distribute files in a Python sdist that are not VCS tracked?

I would like to find the proper way to include files that are not tracked by git in a Python sdist.

Context

The .mo files from my project are not tracked by git (like some other .txt files that need to be created at install time).

I have written a small function in setup.py that creates them at install time, and I call it in setup():

setup(
    .
    .
    .
    data_files=create_extra_files(),
    include_package_data=True,
    .
    .
    .
)

Note that they should belong to data_files because the documentation says:

The data_files option can be used to specify additional files needed by the module distribution: configuration files, message catalogs, data files, anything which doesn’t fit in the previous categories.

So, this works well with python3 setup.py install (and bdist too). The .mo files are generated and stored at the right place.

But if I want it to work with sdist, then I must include them in MANIFEST.in (e.g. recursive-include mathmaker *.mo). Indeed, the documentation says:

Changed in version 3.1: All the files that match data_files will be added to the MANIFEST file if no template is provided. See Specifying the files to distribute.

(The link doesn't help much).

I am reluctant to include *.mo files in MANIFEST.in, as they are not tracked by git. check-manifest doesn't like this kind of situation either: it complains that the lists of files in version control and in the sdist do not match!

So, is there a way to fix this ugly situation?

Steps to reproduce the situation

Environment and project

To avoid polluting your environment, create and activate a dedicated virtual environment (python3.4+) in the directory of your choice:

$ pyvenv-3.4 v0
$ source v0/bin/activate
(v0) $

Reproduce the following tree in a project0 directory:

.
├── .gitignore
├── MANIFEST.in
├── README.rst
├── setup.py
└── project0
    ├── __init__.py
    ├── main.py
    └── data
        └── dummy_versioned.po

Where README.rst, __init__.py and dummy_versioned.po are empty.

Content of the other files:

  • .gitignore:

    build/
    dist/
    *.egg-info
    project0/data/*.txt
    *~
    
  • MANIFEST.in:

    recursive-include project0 *.po
    recursive-include project0 *.txt
    
  • main.py:

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    
    
    def entry_point():
        with open('project0/data/a_file.txt', mode='rt') as f:
            print(f.read())
    
  • setup.py:

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    
    import platform
    from setuptools import setup, find_packages
    
    
    def create_files():
        txt_file_path = 'project0/data/a_file.txt'
        with open(txt_file_path, mode='w+') as f:
            f.write("Some dummy platform information: " + platform.platform())
        return [('project0/data', [txt_file_path])]
    
    
    setup(
        name='project0',
        version='0.0.1',
        author='J. Doe',
        author_email='[email protected]',
        url='http://myproject.url',
        packages=find_packages(),
        data_files=create_files(),
        include_package_data=True,
        entry_points={
            'console_scripts': ['myscript0 = project0.main:entry_point'],
        }
    )
    

Start a local git repo:

(v0) $ git init
(v0) $ git add .

Install check-manifest:

(v0) $ pip3 install check-manifest

Install and test

install works:

(v0) $ python3 setup.py install
.
.
.
copying project0/data/a_file.txt -> build/lib/project0/data
.
.
.
Finished processing dependencies for project0==0.0.1
(v0) $ myscript0 
Some dummy platform information: Linux-3.16.0-29-generic-x86_64-with-Ubuntu-14.04-trusty

If you rm project0/data/a_file.txt, then myscript0 doesn't work anymore, but reinstall it and it works again, as expected.

Building the sdist also includes a_file.txt:

(v0) $ python3 setup.py sdist
.
.
.
hard linking project0/data/a_file.txt -> project0-0.0.1/project0/data
.
.
.

Note that to have this file included in the sdist, it seems necessary (as explained in the "Context" part above) to have recursive-include project0 *.txt in MANIFEST.in. If you remove this line, python3 setup.py sdist no longer mentions a_file.txt (do not forget to remove any previous build/ or dist/ directories to observe this).
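To make the effect of the two MANIFEST.in lines concrete, here is a minimal sketch that approximates recursive-include with fnmatch. The recursive_include helper and the working_tree list are hypothetical and written only for this illustration; the real matching is done by distutils/setuptools and may differ in edge cases. The sketch also shows only what the MANIFEST.in rules select, not the files an sdist picks up by default.

```python
import fnmatch
import posixpath


def recursive_include(paths, directory, pattern):
    """Return the paths under `directory` whose file name matches `pattern`.

    Rough approximation of a MANIFEST.in `recursive-include` line
    (hypothetical helper, not the distutils implementation).
    """
    prefix = directory.rstrip('/') + '/'
    return [
        p for p in paths
        if p.startswith(prefix)
        and fnmatch.fnmatchcase(posixpath.basename(p), pattern)
    ]


# Files present in the working tree once setup.py has run create_files().
working_tree = [
    'README.rst',
    'setup.py',
    'project0/__init__.py',
    'project0/main.py',
    'project0/data/dummy_versioned.po',
    'project0/data/a_file.txt',
]

# MANIFEST.in with both lines, then with the *.txt line removed.
with_txt_rule = (recursive_include(working_tree, 'project0', '*.po')
                 + recursive_include(working_tree, 'project0', '*.txt'))
without_txt_rule = recursive_include(working_tree, 'project0', '*.po')

print(with_txt_rule)
print(without_txt_rule)
```

With the *.txt rule, a_file.txt is selected; without it, only the .po file is.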

Conclusion

So, everything works as it is, but there is this discrepancy: a_file.txt is not tracked by git, but is included in MANIFEST.in.

check-manifest states it clearly:

lists of files in version control and sdist do not match!
missing from VCS:
  project0/data/a_file.txt
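The complaint boils down to a set difference between two file lists. The sketch below is not check-manifest's actual implementation; compare_file_lists and both lists are hypothetical, simplified stand-ins for what git and the sdist would report in this project.

```python
def compare_file_lists(vcs_files, sdist_files):
    """Return (missing_from_vcs, missing_from_sdist) as sorted lists."""
    vcs, sdist = set(vcs_files), set(sdist_files)
    return sorted(sdist - vcs), sorted(vcs - sdist)


# Simplified, illustrative lists (assumption: only the files relevant here).
vcs_files = [
    'MANIFEST.in',
    'README.rst',
    'setup.py',
    'project0/__init__.py',
    'project0/main.py',
    'project0/data/dummy_versioned.po',
]
# The sdist additionally carries the generated, untracked file.
sdist_files = vcs_files + ['project0/data/a_file.txt']

missing_from_vcs, missing_from_sdist = compare_file_lists(vcs_files, sdist_files)
print('missing from VCS:', missing_from_vcs)
print('missing from sdist:', missing_from_sdist)
```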

So, is there a proper way to handle this situation?

zezollo asked Jul 18 '16 14:07



1 Answer

As far as I understand your problem, you would like to distribute some files with the git repository without keeping track of their changes.

This can be done with these four simple steps:

Step 0: First, make sure the content of path/a_file.txt matches the content you want to distribute. As far as I know, it can't be empty, so if you simply want this file to exist, add a newline or space character to it.

Step 1: Add the file(s) to git using git add path/a_file.txt

Step 2: Commit the files (git commit path/a_file.txt)

Step 3: Update git's index and tell git to ignore further changes to the file(s): git update-index --assume-unchanged path/a_file.txt

If you ever want to make changes to this file that should be tracked again, use the --no-assume-unchanged flag to reactivate it in git's index, and then commit the changes.

Note that creating a .gitignore file that tells git to ignore the files (on all machines that clone the repository) and then using git add --force path/a_file.txt won't work, since git will (force-)add the file to the index and keep tracking its changes.
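The steps above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, and the -m commit message is an assumption added so that git commit does not open an editor. The git subcommands and flags themselves are the ones named in the steps.

```python
import shlex
import subprocess


def untrack_changes_commands(path, message='Add generated file'):
    """Steps 1 to 3: add, commit, then mark the file as assume-unchanged."""
    return [
        ['git', 'add', path],
        # Assumption: -m is added here to avoid the interactive editor.
        ['git', 'commit', '-m', message, path],
        ['git', 'update-index', '--assume-unchanged', path],
    ]


def retrack_changes_command(path):
    """Undo step 3 so that later changes show up in git again."""
    return ['git', 'update-index', '--no-assume-unchanged', path]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == '__main__':
    # Only print the commands; call run_all(...) inside a git repository
    # to actually execute them.
    for argv in untrack_changes_commands('path/a_file.txt'):
        print(shlex.join(argv))
    print(shlex.join(retrack_changes_command('path/a_file.txt')))
```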

pentix answered Nov 07 '22 01:11
