Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Copy all files with certain extension, while maintaining directory tree

Tags:

python

My problem:

  1. Traverse a directory, and find all the header files, *.h.
  2. Copy all of these files to another location, but maintain the directory tree

What I've tried:

I've been able to gather all the headers using the os library

for root, dirs, files in os.walk(r'D:\folder\build'):
    for f in files:
        if f.endswith('.h'):
            print os.path.join(root, f)

This correctly prints:

D:\folder\build\a.h
D:\folder\build\b.h
D:\folder\build\subfolder\c.h
D:\folder\build\subfolder\d.h

Where I'm stuck:

With a list of full file paths, how can I copy these files to another location, while maintaining the sub directories? In the above example, I'd want to maintain the directory structure below \build\

For example, I'd want the copy to create the following:

D:\other\subfolder\build\a.h
D:\other\subfolder\build\b.h
D:\other\subfolder\build\subfolder\c.h
D:\other\subfolder\build\subfolder\d.h
like image 474
Cory Kramer Avatar asked Sep 03 '14 11:09

Cory Kramer


3 Answers

You can use shutil.copytree with a ignore callable to filter the files to copy.

"If ignore is given, it must be a callable that will receive as its arguments the directory being visited by copytree(), and a list of its contents (...). The callable must return a sequence of directory and file names relative to the current directory (i.e. a subset of the items in its second argument); these names will then be ignored in the copy process"

So for your specific case, you could write:

from os.path import join, isfile
from shutil import copytree

# ignore any files but files with '.h' extension
ignore_func = lambda d, files: [f for f in files if isfile(join(d, f)) and f[-2:] != '.h']
copytree(src_dir, dest_dir, ignore=ignore_func)
like image 58
pchiquet Avatar answered Nov 11 '22 21:11

pchiquet


Edit: as @pchiquet shows, it can be done in a single command. Yet I will show how this problem could be approached manually.

You're gonna need three things.

You know what directory you were traversing, so to construct the destination path, you need to replace the name of the source root directory with the name of de destination root directory:

walked_directory = 'D:\folder\build'
found_file = 'D:\other\subfolder\build\a.h'
destination_directory = 'D:\other\subfolder'
destination_file = found_file.replace('walked_directory', 'destination_directory')

Now that you have a source and a destination, first you need to make sure the destination exists:

os.makedirs(os.path.dirname(destination_file))

Once it exists, you can copy the file:

shutil.copyfile(found_file, destination_file)
like image 38
Thijs van Dien Avatar answered Nov 11 '22 20:11

Thijs van Dien


This will recursively copy all the files with '.h' extension from current dir to dest_dir without creating subdirectories inside dest_dir:

import glob
import shutil
from pathlib import Path

# ignore any files but files with '.h' extension
for file in glob.glob('**/*.h', recursive=True):
    shutil.copyfile(
        Path(file).absolute(), 
        (Past(dest_dir)/Path(file).name).absolute()
    )
like image 31
glisu Avatar answered Nov 11 '22 21:11

glisu