
How to move and rename documents placed in several nested folders into a new single folder with python?

I have several files across several folders like this:

dir
├── 0
│   ├── 103425.xml
│   ├── 105340.xml
│   ├── 109454.xml
│
├── 1247
│   └── doc.xml
├── 14568
│   └── doc.xml
├── 1659
│   └── doc.xml
├── 10450
│   └── doc.xml
├── 10351
│   └── doc.xml

How can I move all the documents into a single folder, prefixing each moved document's name with the name of its folder?

new_dir
├── 0_103425.xml
├── 0_105340.xml
├── 0_109454.xml
├── 1247_doc.xml
├── 14568_doc.xml
├── 1659_doc.xml
├── 10450_doc.xml
├── 10351_doc.xml

I tried to extract them with:

import os

for path, subdirs, files in os.walk('../dir/'):
    for name in files:
        print(os.path.join(path, name))

UPDATE

Also, I tried to:

import os, shutil
from glob import glob

files = []
start_dir = os.getcwd()
pattern   = "*.xml"

for dir,_,_ in os.walk('../dir/'):
    files.extend(glob(os.path.join(dir,pattern))) 
for f in files:
    print(f)
    shutil.move(f, '../dir/')

The above gave me the path of each file. However, I do not understand how to rename and move them:

---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
<ipython-input-50-229e4256f1f3> in <module>()
     10 for f in files:
     11     print(f)
---> 12     shutil.move(f, '../dir/')

/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/shutil.py in move(src, dst, copy_function)
    540         real_dst = os.path.join(dst, _basename(src))
    541         if os.path.exists(real_dst):
--> 542             raise Error("Destination path '%s' already exists" % real_dst)
    543     try:
    544         os.rename(src, real_dst)

Error: Destination path '../data/230948.xml' already exists

The error above shows why I would like to rename each file with the name of its folder.

asked May 08 '17 17:05 by tumbleweed



2 Answers

How does this work for you?

import os
import pathlib

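OLD_DIR = 'files'
NEW_DIR = 'new_dir'

# Path.rename needs the target folder to exist already
os.makedirs(NEW_DIR, exist_ok=True)

p = pathlib.Path(OLD_DIR)
# '**/*.xml' matches .xml files at any depth below OLD_DIR
for f in p.glob('**/*.xml'):
    # prefix the file name with the name of its immediate parent folder
    new_name = '{}_{}'.format(f.parent.name, f.name)
    f.rename(os.path.join(NEW_DIR, new_name))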

If you can't use pathlib (it was added in Python 3.4), you can also just use glob, os, and shutil:

import os
import glob
import shutil


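# Without recursive=True, '**' behaves like '*', so match exactly one
# folder level below 'files' explicitly (the layout in the question).
for f in glob.glob('files/*/*.xml'):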
    new_name = '{}_{}'.format(os.path.basename(os.path.dirname(f)), os.path.basename(f))
    shutil.move(f, os.path.join('new_dir', new_name))
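
The `Destination path ... already exists` error in the question comes from passing a directory as the destination: `shutil.move` then keeps the source's base name, so two `doc.xml` files collide. Passing a full target path that includes the new file name avoids this. A minimal sketch, assuming a throwaway temporary directory and a hypothetical helper name `move_with_prefix` (not part of `shutil`):

```python
import os
import shutil
import tempfile


def move_with_prefix(src_file, dst_dir):
    """Move src_file into dst_dir as '<parent folder>_<file name>'."""
    parent = os.path.basename(os.path.dirname(src_file))
    new_name = '{}_{}'.format(parent, os.path.basename(src_file))
    target = os.path.join(dst_dir, new_name)
    # A full target path (directory + new name) renames the file as it moves.
    shutil.move(src_file, target)
    return target


def demo():
    """Create two same-named files in a temp tree and move both."""
    root = tempfile.mkdtemp()
    dst = os.path.join(root, 'new_dir')
    os.makedirs(dst)
    sources = []
    for folder in ('1247', '14568'):
        folder_path = os.path.join(root, 'dir', folder)
        os.makedirs(folder_path)
        src_file = os.path.join(folder_path, 'doc.xml')
        with open(src_file, 'w') as fh:
            fh.write(folder)
        sources.append(src_file)
    for src_file in sources:
        move_with_prefix(src_file, dst)
    moved = sorted(os.listdir(dst))
    shutil.rmtree(root)  # clean up the temporary tree
    return moved
```

Calling `demo()` moves both `doc.xml` files into the same folder without a name clash, because each target name carries its folder prefix.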

answered Oct 16 '22 04:10 by Wayne Werner


This is easiest to do with Python 3's new pathlib module for path operations, and then shutil.move for moving the files into their correct places. Unlike os.rename, shutil.move will work like the mv command and behave correctly even for cross-filesystem moves.

This code will work for paths nested to any level - any / or \ in the paths will be replaced with _ in the target filename, so dir/foo/bar/baz/xyzzy.xml will be moved to new_dir/foo_bar_baz_xyzzy.xml.

from pathlib import Path
from shutil import move

src = Path('dir')
dst = Path('new_dir')

# create the target directory (and any missing parents) if it doesn't exist
dst.mkdir(parents=True, exist_ok=True)

# go through each file
for i in src.glob('**/*'):
    # skip directories and alike
    if not i.is_file():
        continue

    # calculate path relative to `src`,
    # this will make dir/foo/bar into foo/bar
    p = i.relative_to(src)

    # replace path separators with underscore, so foo/bar becomes foo_bar
    target_file_name = str(p).replace('/', '_').replace('\\', '_')

    # then do the rename/move. shutil.move also works across filesystems.
    # note that it *doesn't* accept Path objects in Python 3.5, so we
    # use str(...) here. `dst` is a Path object, and `target_file_name`
    # is the name of the file to be placed there; we can use the / operator
    # instead of os.path.join.
    move(str(i), str(dst / target_file_name))
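
One caveat when flattening nested paths this way: two different files can map to the same name, for example `a_b/c.xml` and `a/b_c.xml` both become `a_b_c.xml`. A minimal sketch of a dry-run check, assuming hypothetical helper names `flatten_name` and `plan_moves` (neither is part of `pathlib` or `shutil`), that computes every target name first and raises on a collision before anything is moved:

```python
from pathlib import Path, PurePath


def flatten_name(path, root):
    """Return `path` relative to `root`, with path parts joined by '_'."""
    return '_'.join(PurePath(path).relative_to(root).parts)


def plan_moves(src, dst):
    """Map each file under `src` to its flattened target path under `dst`.

    Raises ValueError if two files would end up with the same name.
    Nothing is moved; this only computes the plan.
    """
    src, dst = Path(src), Path(dst)
    plan = {}
    seen = {}  # flattened name -> first source file that produced it
    for f in sorted(src.glob('**/*')):
        if not f.is_file():
            continue
        name = flatten_name(f, src)
        if name in seen:
            raise ValueError(
                '{} and {} both map to {}'.format(seen[name], f, name))
        seen[name] = f
        plan[f] = dst / name
    return plan
```

If `plan_moves` returns without raising, each `(source, target)` pair in the plan can be passed to `shutil.move(str(source), str(target))` as in the loop above.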