Get progress back from shutil file copy thread

I have an application that copies a file from src to dst:

    import shutil
    from threading import Thread

    # Thread.start() returns None, so keep the Thread object before starting it
    t = Thread(target=shutil.copy, args=[src, dst])
    t.start()

I wish to have the application query the progress of the copy every 5 seconds without locking up the application itself. Is this possible?

My intention is to set this progress to a QtGui.QLabel to give the user feedback on the file copy.

Can this be achieved when copying using a threaded shutil file copy?

fredrik asked Apr 30 '15 12:04


2 Answers

shutil.copy() doesn't offer any options to track the progress, no. At most you could monitor the size of the destination file (using os.* functions on the target filename).
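A minimal sketch of that size-monitoring approach, assuming `dst` is a full file path (not a directory) and that the destination grows on disk as data is written. The helper name `poll_copy_progress` and its `report` and `interval` parameters are hypothetical, not part of shutil:

```python
import os
import shutil
import threading


def poll_copy_progress(src, dst, report, interval=5.0):
    """Copy src to dst in a worker thread and call report(percent) periodically.

    Hypothetical helper: progress is estimated from the size of dst on disk.
    """
    total = os.path.getsize(src)
    worker = threading.Thread(target=shutil.copy, args=(src, dst))
    worker.start()
    while True:
        # Wait for the worker, but wake up at least every `interval` seconds.
        worker.join(timeout=interval)
        finished = not worker.is_alive()
        try:
            done = os.path.getsize(dst)
        except OSError:
            done = 0  # destination file has not been created yet
        report(100.0 * done / total if total else 100.0)
        if finished:
            break
```

This loop blocks its caller until the copy ends, so in a GUI application the size check would more likely live in a timer callback. Depending on how the platform buffers writes, the reported size may also advance in uneven jumps.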

The alternative would be to implement your own copy function. The implementation is really quite simple; shutil.copy() is basically a shutil.copyfile() plus shutil.copymode() call; shutil.copyfile() in turn delegates the real work to shutil.copyfileobj()* (as of the Python 3.8.2 source code).

Implementing your own shutil.copyfileobj() to include progress should be trivial; inject support for a callback function to inform your program each time another block has been copied:

    import os
    import shutil

    def copyfileobj(fsrc, fdst, callback, length=0):
        try:
            # check for optimisation opportunity
            if "b" in fsrc.mode and "b" in fdst.mode and fsrc.readinto:
                return _copyfileobj_readinto(fsrc, fdst, callback, length)
        except AttributeError:
            # one or both file objects do not support a .mode or .readinto attribute
            pass

        if not length:
            length = shutil.COPY_BUFSIZE

        fsrc_read = fsrc.read
        fdst_write = fdst.write

        copied = 0
        while True:
            buf = fsrc_read(length)
            if not buf:
                break
            fdst_write(buf)
            copied += len(buf)
            callback(copied)

    # differs from shutil.COPY_BUFSIZE on platforms != Windows
    READINTO_BUFSIZE = 1024 * 1024

    def _copyfileobj_readinto(fsrc, fdst, callback, length=0):
        """readinto()/memoryview() based variant of copyfileobj().
        *fsrc* must support readinto() method and both files must be
        open in binary mode.
        """
        fsrc_readinto = fsrc.readinto
        fdst_write = fdst.write

        if not length:
            try:
                file_size = os.stat(fsrc.fileno()).st_size
            except OSError:
                file_size = READINTO_BUFSIZE
            length = min(file_size, READINTO_BUFSIZE)

        copied = 0
        with memoryview(bytearray(length)) as mv:
            while True:
                n = fsrc_readinto(mv)
                if not n:
                    break
                elif n < length:
                    with mv[:n] as smv:
                        fdst.write(smv)
                else:
                    fdst_write(mv)
                copied += n
                callback(copied)

and then, in the callback, compare the copied size with the file size.
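As a rough sketch of how such a callback could feed a periodic progress query from another thread: the names `CopyProgress`, `copy_with_callback` and `start_copy` below are hypothetical, and the copy loop is a simplified stand-in for the callback-enabled copyfileobj() shown above, not the shutil implementation.

```python
import os
import threading


class CopyProgress:
    """Hypothetical holder for progress shared between worker and caller."""

    def __init__(self, total):
        self.total = total
        self.copied = 0

    def update(self, copied):
        # Called from the worker thread after every block is written.
        self.copied = copied

    @property
    def percent(self):
        # Compare the copied size with the file size; an empty file counts as done.
        return 100.0 * self.copied / self.total if self.total else 100.0


def copy_with_callback(src, dst, callback, length=64 * 1024):
    """Plain read/write loop that reports the running byte count."""
    copied = 0
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        while True:
            buf = fsrc.read(length)
            if not buf:
                break
            fdst.write(buf)
            copied += len(buf)
            callback(copied)


def start_copy(src, dst):
    """Start the copy in a background thread; return (thread, progress)."""
    progress = CopyProgress(os.path.getsize(src))
    thread = threading.Thread(
        target=copy_with_callback, args=(src, dst, progress.update)
    )
    thread.start()
    return thread, progress
```

The application can then read `progress.percent` whenever it likes, for example from a timer that fires every 5 seconds, and update the label from the GUI thread rather than from the worker.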

Note that in the above implementation we look for the opportunity to use a different method for binary files, where you can use fileobj.readinto() and a memoryview object to avoid redundant data copying; see the original _copyfileobj_readinto() implementation for comparison.


* footnote to … delegates the real work to shutil.copyfileobj(): As of Python 3.8, on OS X and Linux the copyfile() implementation delegates file copying to OS-specific, optimised system calls (to fcopyfile() and sendfile(), respectively) but these calls have no hooks whatsoever to track progress, and so if you need to track progress you'd want to disable these delegation paths anyway. On Windows the code uses the aforementioned _copyfileobj_readinto() function.

Martijn Pieters answered Oct 27 '22 01:10


I combined Martijn Pieters' answer with progress bar code from another answer, modified to work in PyCharm, which gives me the following. The function copy_with_progress was my goal.

    import os
    import shutil


    def progress_percentage(perc, width=None):
        # This will only work for python 3.3+ due to use of
        # os.get_terminal_size the print function etc.

        FULL_BLOCK = '█'
        # this is a gradient of incompleteness
        INCOMPLETE_BLOCK_GRAD = ['░', '▒', '▓']

        assert(isinstance(perc, float))
        assert(0. <= perc <= 100.)
        # if width unset use full terminal
        if width is None:
            width = os.get_terminal_size().columns
        # progress bar is block_widget separator perc_widget : ####### 30%
        max_perc_widget = '[100.00%]' # 100% is max
        separator = ' '
        blocks_widget_width = width - len(separator) - len(max_perc_widget)
        assert(blocks_widget_width >= 10) # not very meaningful if not
        perc_per_block = 100.0/blocks_widget_width
        # epsilon is the sensitivity of rendering a gradient block
        epsilon = 1e-6
        # number of blocks that should be represented as complete
        full_blocks = int((perc + epsilon)/perc_per_block)
        # the rest are "incomplete"
        empty_blocks = blocks_widget_width - full_blocks

        # build blocks widget
        blocks_widget = ([FULL_BLOCK] * full_blocks)
        blocks_widget.extend([INCOMPLETE_BLOCK_GRAD[0]] * empty_blocks)
        # marginal case - remainder due to how granular our blocks are
        remainder = perc - full_blocks*perc_per_block
        # epsilon needed for rounding errors (check would be != 0.)
        # based on remainder modify first empty block shading
        if remainder > epsilon:
            grad_index = int((len(INCOMPLETE_BLOCK_GRAD) * remainder)/perc_per_block)
            blocks_widget[full_blocks] = INCOMPLETE_BLOCK_GRAD[grad_index]

        # build perc widget
        str_perc = '%.2f' % perc
        # -3 because the brackets and the percentage sign are not included
        perc_widget = '[%s%%]' % str_perc.ljust(len(max_perc_widget) - 3)

        # form progressbar and return it as a string
        progress_bar = '%s%s%s' % (''.join(blocks_widget), separator, perc_widget)
        return progress_bar


    def copy_progress(copied, total):
        print('\r' + progress_percentage(100*copied/total, width=30), end='')


    def copyfile(src, dst, *, follow_symlinks=True):
        """Copy data from src to dst.

        If follow_symlinks is not set and src is a symbolic link, a new
        symlink will be created instead of copying the file it points to.

        """
        if shutil._samefile(src, dst):
            raise shutil.SameFileError("{!r} and {!r} are the same file".format(src, dst))

        for fn in [src, dst]:
            try:
                st = os.stat(fn)
            except OSError:
                # File most likely does not exist
                pass
            else:
                # XXX What about other special files? (sockets, devices...)
                if shutil.stat.S_ISFIFO(st.st_mode):
                    raise shutil.SpecialFileError("`%s` is a named pipe" % fn)

        if not follow_symlinks and os.path.islink(src):
            os.symlink(os.readlink(src), dst)
        else:
            size = os.stat(src).st_size
            with open(src, 'rb') as fsrc:
                with open(dst, 'wb') as fdst:
                    copyfileobj(fsrc, fdst, callback=copy_progress, total=size)
        return dst


    def copyfileobj(fsrc, fdst, callback, total, length=16*1024):
        copied = 0
        while True:
            buf = fsrc.read(length)
            if not buf:
                break
            fdst.write(buf)
            copied += len(buf)
            callback(copied, total=total)


    def copy_with_progress(src, dst, *, follow_symlinks=True):
        if os.path.isdir(dst):
            dst = os.path.join(dst, os.path.basename(src))
        copyfile(src, dst, follow_symlinks=follow_symlinks)
        shutil.copymode(src, dst)
        return dst

flutefreak7 answered Oct 27 '22 01:10