Copying files with shutil.copyfile() takes at least 3 times longer than a regular right-click-copy > right-click-paste in Windows File Explorer or Mac's Finder. Is there a faster alternative to shutil.copyfile() in Python? What could be done to speed up the file copying process? (The destination is on a network drive, if that makes any difference.)
Here is what I have ended up with:
    import subprocess
    import sys

    def copyWithSubprocess(cmd):
        # Start the platform's native copy command as a child process.
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    win = mac = False
    if sys.platform.startswith("darwin"):
        mac = True
    elif sys.platform.startswith("win"):
        win = True

    # source and dest are the paths to copy from and to.
    cmd = None
    if mac:
        cmd = ['cp', source, dest]
    elif win:
        cmd = ['xcopy', source, dest, '/K/O/X']

    if cmd:
        copyWithSubprocess(cmd)
Robocopy (Robust File Copy) makes this much easier and faster, especially over a network. To use Robocopy, open Start, type Command Prompt, and click “Command Prompt” in the search results. You can also right-click Start and select “Windows PowerShell.” Either way, type the following command to list the available options: robocopy /?
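Since the question is about Python, here is a rough sketch of how Robocopy could be driven from a script. The helper names (build_robocopy_cmd, robocopy_succeeded, copy_with_robocopy) are made up for illustration, and no extra switches are passed. It relies on two things about Robocopy: it takes a source directory, a destination directory and then file names, and it reports success with exit codes below 8.

```python
import subprocess
from pathlib import PureWindowsPath


def build_robocopy_cmd(src_file, dst_dir):
    """Build a robocopy command that copies one file into dst_dir.

    robocopy expects a source directory, a destination directory and then
    the file name(s), so the source path is split into folder and name.
    """
    src = PureWindowsPath(src_file)
    return ["robocopy", str(src.parent), str(dst_dir), src.name]


def robocopy_succeeded(returncode):
    # robocopy exit codes below 8 mean success (for example, 1 means files
    # were copied); 8 and above mean at least one failure.
    return returncode < 8


def copy_with_robocopy(src_file, dst_dir):
    """Run robocopy (Windows only) and raise if it reports a failure."""
    cmd = build_robocopy_cmd(src_file, dst_dir)
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if not robocopy_succeeded(result.returncode):
        raise RuntimeError("robocopy failed with exit code %d" % result.returncode)
    return result.returncode
```

Note that a plain check for a zero return code would be wrong here, because Robocopy returns 1 when files were copied successfully.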
The fastest version I got, without over-optimizing, is the following code:
    import os

    class CTError(Exception):
        def __init__(self, errors):
            self.errors = errors

    try:
        O_BINARY = os.O_BINARY
    except AttributeError:
        # os.O_BINARY exists only on Windows
        O_BINARY = 0
    READ_FLAGS = os.O_RDONLY | O_BINARY
    WRITE_FLAGS = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY
    BUFFER_SIZE = 128*1024

    def copyfile(src, dst):
        try:
            fin = os.open(src, READ_FLAGS)
            stat = os.fstat(fin)
            fout = os.open(dst, WRITE_FLAGS, stat.st_mode)
            # os.read returns bytes, so the end-of-file sentinel must be b""
            for x in iter(lambda: os.read(fin, BUFFER_SIZE), b""):
                os.write(fout, x)
        finally:
            try:
                os.close(fin)
            except Exception:
                pass
            try:
                os.close(fout)
            except Exception:
                pass

    def copytree(src, dst, symlinks=False, ignore=[]):
        names = os.listdir(src)

        if not os.path.exists(dst):
            os.makedirs(dst)
        errors = []
        for name in names:
            if name in ignore:
                continue
            srcname = os.path.join(src, name)
            dstname = os.path.join(dst, name)
            try:
                if symlinks and os.path.islink(srcname):
                    linkto = os.readlink(srcname)
                    os.symlink(linkto, dstname)
                elif os.path.isdir(srcname):
                    copytree(srcname, dstname, symlinks, ignore)
                else:
                    copyfile(srcname, dstname)
                # XXX What about devices, sockets etc.?
            except (IOError, os.error) as why:
                errors.append((srcname, dstname, str(why)))
            except CTError as err:
                errors.extend(err.errors)
        if errors:
            raise CTError(errors)
This code runs a little bit slower than the native Linux "cp -rf".
Compared to shutil, the gain is around 2x-3x for local storage to tmpfs and around 6x for NFS to local storage.
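To check figures like these on your own setup, a small timing harness along the following lines may help. It is only a sketch: copy_buffered is a hypothetical stand-in for a chunked copy with a 128 KiB buffer, and compare() uses a local temporary directory, so it will not reproduce NFS or network-drive numbers unless you point the paths at those locations.

```python
import os
import shutil
import tempfile
import time


def copy_buffered(src, dst, buffer_size=128 * 1024):
    """Copy src to dst in fixed-size chunks (hypothetical stand-in)."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, buffer_size)


def time_copy(copy_func, src, dst, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        copy_func(src, dst)
        best = min(best, time.perf_counter() - start)
    return best


def compare(size_bytes=8 * 1024 * 1024):
    """Time shutil.copyfile against copy_buffered on a temporary file."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(size_bytes))
        return {
            "shutil.copyfile": time_copy(shutil.copyfile, src, os.path.join(tmp, "a.bin")),
            "copy_buffered": time_copy(copy_buffered, src, os.path.join(tmp, "b.bin")),
        }
```

Calling compare() returns a dict of best times in seconds. Results depend heavily on the Python version, the operating system and caching, so treat a single run as indicative only.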
After profiling, I noticed that shutil.copy makes lots of fstat syscalls, which are pretty heavyweight. To optimize further, I would suggest doing a single fstat for src and reusing the values. Honestly, I didn't go further, as I got almost the same figures as the native Linux copy tool, and optimizing for several hundred milliseconds wasn't my goal.
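The single-fstat idea could look roughly like the sketch below. The function name copy_with_single_stat is made up for illustration, and it assumes Python 3.3+ for the ns argument of os.utime. The source is stat-ed once, and the cached result is reused for the destination's permission bits and timestamps.

```python
import os
import stat

BUFFER_SIZE = 128 * 1024
O_BINARY = getattr(os, "O_BINARY", 0)  # only defined on Windows


def copy_with_single_stat(src, dst):
    """Copy data, permission bits and timestamps using one fstat call."""
    fin = os.open(src, os.O_RDONLY | O_BINARY)
    try:
        st = os.fstat(fin)  # the only stat call on src; reused below
        fout = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY,
                       stat.S_IMODE(st.st_mode))
        try:
            # os.read returns b"" at end of file, which stops the iterator.
            for chunk in iter(lambda: os.read(fin, BUFFER_SIZE), b""):
                os.write(fout, chunk)
        finally:
            os.close(fout)
    finally:
        os.close(fin)
    # Reuse the cached values instead of stat-ing src again.
    os.chmod(dst, stat.S_IMODE(st.st_mode))
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
    return st.st_size
```

Whether this is measurably faster than shutil on a given system is something to verify by timing it, as the number of stat calls shutil makes varies between Python versions.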