How to move a Python script from one subpackage to another directory/package, maintaining backwards compatibility

I have a shared Python code base, and I'm responsible for code that others depend on. I need to move modules from one subpackage to another directory/package to reorganize it. How do I do that in the safest way?

If I just move the code, anyone who hasn't updated their imports will be affected: their code will unexpectedly fail as soon as the old imports stop resolving.

How can I ensure a seamless transition? Do I just copy the code and leave the old code in place until the imports have been changed? Are there any caveats to be aware of? What if I use import * in conjunction with an __all__? In what case would I have to support imports from the old location indefinitely?

Russia Must Remove Putin asked Jun 14 '15 18:06



1 Answer

I was recently asked to move code from one subpackage to another at work, and the approach I used didn't seem immediately obvious to other developers, so I'm documenting it here for others.

I don't recommend leaving a copy at the old place. If you have two copies of the same script, one is likely to change without the other. Instead, I recommend the following multi-step process.

The first step involves two parts that can be implemented simultaneously if you control both locations for the code.

Step One: Implement the Move

  1. First, move the file from the old place to the new place under version control. I use a simplified interface to CVS, so for me it was a version control copy. In most other version control systems (like Mercurial, Subversion, and Git), you should use the system's mv command to move the file, e.g. with git:

     git mv /location/old/script.py /location/new/script.py
    

Important:

Don't forget to move the unit tests too, and also move the __init__.py files if the old ones contain code that needs to be kept. Otherwise, just make sure __init__.py files are committed in the new location if they're not already in place.

  2. Next, in the place of the old code, import all the names from the new location,

    so in /location/old/script.py:

     from location.new.script import *
    

    and leave a comment explaining why this is needed, and commit the change to version control. If you moved the __init__.py, just make sure to commit a new empty __init__.py in the old location.
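The two parts above can be tried out end to end on a throwaway copy of the layout. The following is only a minimal sketch: it builds the `location.old` / `location.new` tree in a temporary directory, and the `greet` function is a hypothetical stand-in for whatever the moved script really contains.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Forget any previously imported "location" package so the demo is repeatable.
for name in [m for m in sys.modules if m == "location" or m.startswith("location.")]:
    del sys.modules[name]

# Build a throwaway package tree that mirrors the layout in this answer.
root = Path(tempfile.mkdtemp())
for package in ("location", "location/old", "location/new"):
    (root / package).mkdir()
    (root / package / "__init__.py").write_text("")

# The real code now lives only in the new location.
(root / "location/new/script.py").write_text(textwrap.dedent('''
    def greet(name):
        """Hypothetical function standing in for the moved code."""
        return "Hello, " + name
'''))

# The old location is reduced to a shim with a comment explaining why.
(root / "location/old/script.py").write_text(textwrap.dedent('''
    # Moved to location.new.script. This shim keeps old imports working
    # until every caller has switched to the new location.
    from location.new.script import *
'''))

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from location.old.script import greet as old_greet
from location.new.script import greet as new_greet

# Both import paths resolve to the very same function object.
print(old_greet is new_greet)
print(old_greet("world"))
```

Because the shim re-exports the names instead of copying the code, there is still only one definition to maintain.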

There's a major caveat here. import * is affected by __all__: if the new module declares an __all__, the star import only brings in the names listed there. In that case you've got two approaches to providing the missing names. You can import them all explicitly:

from location.new.script import *
# names not in the new.script's __all__:
from location.new.script import foo, bar, baz 

or you can delete the old file and instead import the module in the old package's __init__.py, and register it in sys.modules under the old dotted path like this:

from location.new import script
import sys
sys.modules['location.old.script'] = script

This code will initialize the package and add the module to sys.modules just in time for it to be looked up there by the importer. This is the same way that os.path is created in the Python source. Most people would shy away from modifying sys.modules, however. In fact, I hesitate to leave this suggestion here, and I would not if it were not in the Python standard library.
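Both approaches can be compared side by side on a throwaway tree. This is a sketch under stated assumptions, not a drop-in recipe: `public`, `foo`, the `star_only` shim, and the `location.legacy` package are hypothetical names invented for the demo. `location.legacy` plays the role of an old package whose script file has been deleted, so it can sit next to `location.old` in the same run.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Forget any previously imported "location" package so the demo is repeatable.
for name in [m for m in sys.modules if m == "location" or m.startswith("location.")]:
    del sys.modules[name]

root = Path(tempfile.mkdtemp())
for package in ("location", "location/old", "location/new", "location/legacy"):
    (root / package).mkdir()
    (root / package / "__init__.py").write_text("")

# New home of the code; __all__ deliberately omits foo.
(root / "location/new/script.py").write_text(textwrap.dedent('''
    __all__ = ["public"]

    def public():
        return "public"

    def foo():
        return "foo"
'''))

# For comparison: a shim that relies on the star import alone.
(root / "location/old/star_only.py").write_text(
    "from location.new.script import *\n"
)

# Approach 1: star import plus an explicit import of the missing name.
(root / "location/old/script.py").write_text(textwrap.dedent('''
    from location.new.script import *
    # names not in the new script's __all__:
    from location.new.script import foo
'''))

# Approach 2: no script.py at all; the package __init__ aliases the module.
(root / "location/legacy/__init__.py").write_text(textwrap.dedent('''
    import sys
    from location.new import script
    sys.modules["location.legacy.script"] = script
'''))

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import location.new.script as real
import location.old.star_only as star_only
import location.old.script as explicit_shim
import location.legacy.script as aliased

print(hasattr(star_only, "foo"))      # False: __all__ filtered it out
print(hasattr(explicit_shim, "foo"))  # True: restored by the explicit import
print(aliased is real)                # True: the old path is the same module
```

The alias exposes every attribute of the module regardless of `__all__`, since the old dotted path and the new one refer to one module object.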

These two parts can together be pushed into production, and the move has been seamlessly implemented. If you have no control over users of your code, this may need to remain in place indefinitely for backwards compatibility.

Optional: I would then delete the old script at head (just at head, don't push it yet!) so that other developers can see the change coming and address the change in a timely fashion.

Step Two: Implement the Rereferencing

If you can search all of the code that depends on yours, I recommend searching it for the following regular expression:

(import|from).*location\.old.*script

If you are on Unix (or have Cygwin) you can run the search with grep:

grep -rEe "(import|from).*location\.old.*script" .

Alternatively, most IDEs have a regex search.
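If neither grep nor an IDE is at hand, the same search can be sketched in Python. This is a rough equivalent rather than a replacement for grep: the helper name `find_old_imports` is made up for illustration, it only looks at `*.py` files (grep -r looks at everything), and the demo files are written to a temporary directory.

```python
import re
import tempfile
from pathlib import Path

# Same pattern as the grep command above.
OLD_IMPORT = re.compile(r"(import|from).*location\.old.*script")


def find_old_imports(root):
    """Yield (path, line number, line) for each line matching the old import."""
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if OLD_IMPORT.search(line):
                yield path, number, line.strip()


# Demo on a throwaway tree: one file still uses the old location.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "uses_old.py").write_text("from location.old import script\n")
(demo_root / "uses_new.py").write_text("from location.new import script\n")

hits = list(find_old_imports(demo_root))
for path, number, line in hits:
    print(f"{path.name}:{number}: {line}")
```

An empty result only covers the code you were able to search, so it is evidence rather than proof that nobody uses the old location.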

If you do have control over the code that uses it, or you have a view on others that use it, it's fairly straightforward to change the imports from the old to the new, e.g. from:

import location.old.script

to

import location.new.script

and from

from location.old import script

to

from location.new import script 

And so on.
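For a large code base, the rewrites above could be scripted. The following is a minimal sketch with a hypothetical helper, `rewrite_imports`, that handles only the simple forms shown here; review the resulting diff by hand before committing anything.

```python
import re

# Match "location.old" only where it leads to "script": either as a dotted
# path (location.old.script) or in "from location.old import script".
OLD_PATH = re.compile(r"\blocation\.old(?=\.script\b|\s+import\s+script\b)")


def rewrite_imports(source):
    """Return source with references to the old location pointed at the new one."""
    return OLD_PATH.sub("location.new", source)


before = (
    "import location.old.script\n"
    "from location.old import script\n"
    "import location.old.other\n"
)
after = rewrite_imports(before)
print(after)
```

Dotted usages in the body of the code, such as calls through `location.old.script`, are rewritten too, which is what an `import location.old.script` caller needs. A line like `from location.old import script, other` would also be rewritten, which is wrong if `other` has not moved, so mixed imports need manual attention.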

Important:

All of these changes need to be implemented and released to production. Any production installation that still lacks them will fail once you delete the old location.

Step Three: Delete the Old Script in Production

This is the dangerous step. If you've missed any users/importers, their code will fail until they get their import fixes into production. You may choose to postpone this step indefinitely, but my preference is to get it done in a timely fashion if I can prove all of the changes have been pushed into production.

If you deleted it at head immediately after making the change so that others could see the change coming in development, you may have less to worry about.

Still, do not delete this until you can prove that no other users are referencing the old package location in production. If you can't prove it, don't delete it.

Russia Must Remove Putin answered Sep 23 '22 19:09