Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python shutil copytree: use ignore function to keep specific files types

I'm trying to figure out how to copy CAD drawings (".dwg", ".dxf) from a source directory with subfolders to a destination directory and maintaining the original directory and subfolders structure.

  • Original Directory: H:\Tanzania...\Bagamoyo_Single_line.dwg
  • Source Directory: H:\CAD\Tanzania...\Bagamoyo_Single_line.dwg

I found the following answer from @martineau within the following post: Python Factory Function

from fnmatch import fnmatch, filter
from os.path import isdir, join
from shutil import copytree

def include_patterns(*patterns):
    """Factory function that can be used with copytree() ignore parameter.

    Arguments define a sequence of glob-style patterns
    that are used to specify what files to NOT ignore.
    Creates and returns a function that determines this for each directory
    in the file hierarchy rooted at the source directory when used with
    shutil.copytree().
    """
    def _ignore_patterns(path, names):
        keep = set(name for pattern in patterns
                            for name in filter(names, pattern))
        ignore = set(name for name in names
                        if name not in keep and not isdir(join(path, name)))
        return ignore
    return _ignore_patterns

# sample usage

copytree(src_directory, dst_directory,
         ignore=include_patterns('*.dwg', '*.dxf'))

Updated: 18:21. The following code works as expected, except that I'd like to ignore folders that don't contain any include_patterns('.dwg', '.dxf')

like image 980
Peter Wilson Avatar asked Feb 27 '17 13:02

Peter Wilson


People also ask

What is the difference between Shutil copy () and Shutil Copystat () functions?

copystat() method in Python is used to copy the permission bits, last access time, last modification time, and flags value from the given source path to given destination path. The shutil. copystat() method does not affect the file content and owner and group information.

How does Shutil Copytree work?

shutil. copytree() method recursively copies an entire directory tree rooted at source (src) to the destination directory. The destination directory, named by (dst) must not already exist. It will be created during copying.

What is Shutil Rmtree in Python?

This module helps in automating the process of copying and removal of files and directories. shutil. rmtree() is used to delete an entire directory tree, path must point to a directory (but not a symbolic link to a directory). Syntax: shutil.rmtree(path, ignore_errors=False, onerror=None)


1 Answers

shutil already contains a function ignore_patterns, so you don't have to provide your own. Straight from the documentation:

from shutil import copytree, ignore_patterns  copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*')) 

This will copy everything except .pyc files and files or directories whose name starts with tmp.

It's a bit tricky (and not strictly necessary) to explain what's going on: ignore_patterns returns a function _ignore_patterns as its return value, this function gets stuffed into copytree as a parameter, and copytree calls this function as needed, so you don't have to know or care how to call this function _ignore_patterns. It just means that you can exclude certain unneeded cruft files (like *.pyc) from being copied. The fact that the name of the function _ignore_patterns starts with an underscore is a hint that this function is an implementation detail you may ignore.

copytree expects that the folder destination doesn't exist yet. It is not a problem that this folder and its subfolders come into existence once copytree starts to work, copytree knows how to handle that.

Now include_patterns is written to do the opposite: ignore everything that's not explicitly included. But it works the same way: you just call it, it returns a function under the hood, and coptytree knows what to do with that function:

copytree(source, destination, ignore=include_patterns('*.dwg', '*.dxf')) 
like image 77
Jan Avatar answered Sep 21 '22 12:09

Jan