
python script to concatenate all the files in the directory into one file

Tags: python, file, copy

I have written the following script to concatenate all the files in the directory into one single file.

Can this be optimized, in terms of

  1. idiomatic python

  2. time

Here is the snippet:

import time, glob

outfilename = 'all_' + str((int(time.time()))) + ".txt"

filenames = glob.glob('*.txt')

with open(outfilename, 'wb') as outfile:
    for fname in filenames:
        with open(fname, 'r') as readfile:
            infile = readfile.read()
            for line in infile:
                outfile.write(line)
            outfile.write("\n\n")
user1629366 asked Jul 19 '13 15:07



1 Answer

Use shutil.copyfileobj to copy data:

import glob
import shutil

# outfilename is the name built in the question's snippet
with open(outfilename, 'wb') as outfile:
    for filename in glob.glob('*.txt'):
        if filename == outfilename:
            # don't want to copy the output into the output
            continue
        with open(filename, 'rb') as readfile:
            shutil.copyfileobj(readfile, outfile)

shutil.copyfileobj reads from the readfile object in chunks and writes them directly to the outfile file object. Do not use readline() or iterate over the file line by line, since you do not need the overhead of finding line endings.
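To make the chunked copy concrete, here is a rough hand-written equivalent. It is only a sketch of the idea, not the actual shutil source; the helper name copy_in_chunks and the chunk size are assumptions chosen for illustration, and in-memory BytesIO objects stand in for real files.

```python
import io


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in fixed-size chunks (hypothetical helper).

    Roughly the loop that shutil.copyfileobj runs for you: no line
    splitting, just read a block and write it out until EOF.
    """
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read means end of file
            break
        dst.write(chunk)


# In-memory stand-ins for two files opened in binary mode.
src = io.BytesIO(b"line one\nline two\n" * 3)
dst = io.BytesIO()

# A tiny chunk size forces several iterations of the loop.
copy_in_chunks(src, dst, chunk_size=8)
```

In real code shutil.copyfileobj(readfile, outfile) does this for you; it also accepts an optional third argument, length, to control the chunk size.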

Use the same mode for both reading and writing. This is especially important in Python 3, where a file opened in text mode returns str and a file opened in binary mode accepts only bytes. I've used binary mode for both here.
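Putting the pieces together, the following is a minimal self-contained sketch. The function name concatenate, its separator parameter, and the demo file names are assumptions for illustration; the separator is there only because the question's snippet wrote blank lines between files, and it must be bytes since both files are opened in binary mode.

```python
import glob
import os
import shutil
import tempfile


def concatenate(pattern, outfilename, separator=b""):
    """Concatenate all files matching pattern into outfilename (sketch)."""
    out_abs = os.path.abspath(outfilename)
    # Skip the output file itself so it is never copied into itself.
    filenames = sorted(
        name for name in glob.glob(pattern)
        if os.path.abspath(name) != out_abs
    )
    with open(outfilename, "wb") as outfile:
        for filename in filenames:
            # Binary mode on both sides: bytes in, bytes out.
            with open(filename, "rb") as readfile:
                shutil.copyfileobj(readfile, outfile)
            outfile.write(separator)
    return filenames


# Demo in a throwaway directory with two small input files.
with tempfile.TemporaryDirectory() as tmp:
    for name, data in (("a.txt", b"first"), ("b.txt", b"second")):
        with open(os.path.join(tmp, name), "wb") as f:
            f.write(data)

    out = os.path.join(tmp, "all_demo.txt")
    pattern = os.path.join(tmp, "*.txt")

    concatenate(pattern, out, separator=b"\n\n")
    # Second run: all_demo.txt now matches the pattern but is skipped.
    copied = concatenate(pattern, out, separator=b"\n\n")

    with open(out, "rb") as f:
        result = f.read()
```

Sorting the file names is a design choice made here for reproducible output; glob.glob itself makes no ordering guarantee.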

Martijn Pieters answered Oct 13 '22 22:10