
Python watchdog windows wait till copy finishes

I am using the Python watchdog module on a Windows 2012 server to monitor new files appearing on a shared drive. When watchdog notices the new file it kicks off a database restore process.

However, it seems that watchdog will attempt to restore the file the second it is created and not wait till the file has finished copying to the shared drive. So I changed the event to on_modified but there are two on_modified events, one when the file is initially being copied and one when it is finished being copied.

How can I handle the two on_modified events to only fire when the file being copied to the shared drive has finished?

What happens when multiple files are copied to the shared drive at the same time?

Here is my code

import os
import time
import subprocess
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class NewFile(FileSystemEventHandler):
    def process(self, event):
        # Ignore directory events; only files are restored.
        if event.is_directory:
            return

        if event.event_type == 'modified':
            if getext(event.src_path) == 'gz':
                load_pgdump(event.src_path)

    def on_modified(self, event):
        self.process(event)

def getext(filename):
    "Get the file extension (without the leading dot)"
    # splitext looks only at the last dot, so 'dump.sql.gz' yields 'gz'.
    file_ext = os.path.splitext(filename)[1].lstrip(".")
    return file_ext

def load_pgdump(src_path):
    restore = 'pg_restore command ' + src_path
    subprocess.call(restore, shell=True)

def main():
    event_handler = NewFile()
    observer = Observer()
    observer.schedule(event_handler, path='Y:\\', recursive=True)
    observer.start()

    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()

if __name__ == '__main__':
    main()
tjmgis asked Aug 19 '15 10:08


4 Answers

On Linux you also get a close event, so the solution is to hold off processing a file until it has been closed. My approach would be to add on_closed handling.

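class Handler(FileSystemEventHandler):
    def __init__(self):
        super().__init__()
        self.files_to_process = set()

    def dispatch(self, event):
        # Route only 'created' and 'closed' events; ignore everything else.
        _method_map = {
            'created': self.on_created,
            'closed': self.on_closed
        }
        handler = _method_map.get(event.event_type)
        if handler is not None:
            handler(event)

    def on_created(self, event):
        self.files_to_process.add(event.src_path)

    def on_closed(self, event):
        # Only process files whose creation was seen by this handler.
        if event.src_path in self.files_to_process:
            self.files_to_process.discard(event.src_path)
            actual_processing(event.src_path)  # placeholder for the real work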
ravenwing answered Dec 01 '22 00:12


In your on_modified handler, just wait until the file has finished copying by watching its size.

Offering a Simpler Loop:

import os
import time

historicalSize = -1
# Poll once a second until two consecutive size readings match.
while (historicalSize != os.path.getsize(filename)):
  historicalSize = os.path.getsize(filename)
  time.sleep(1)
print("file copy has now finished")
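
A single unchanged reading can be a false positive if the copy stalls briefly on a slow share. A slightly more defensive sketch of the same size-watching idea requires several identical readings in a row and gives up after a timeout. The function name `wait_for_stable_size` and all of its parameters are made up for illustration; they are not part of watchdog.

```python
import os
import time


def wait_for_stable_size(path, interval=1.0, stable_checks=3, timeout=600.0):
    """Return True once the size of `path` has stayed the same across
    `stable_checks` consecutive re-checks, or False if `timeout` seconds
    pass first.

    Hypothetical helper for illustration, not part of watchdog.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    unchanged = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            # File missing or temporarily inaccessible: start counting again.
            size = -1
        if size >= 0 and size == last_size:
            unchanged += 1
            if unchanged >= stable_checks:
                return True
        else:
            unchanged = 0
            last_size = size
        time.sleep(interval)
    return False
```

Calling this from inside on_modified blocks the observer thread for the duration of the copy, so it is probably better run from a worker thread or from the main loop.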
Mtl Dev answered Nov 30 '22 23:11


I'm using the following code to wait until a file has been copied (Windows only):

from ctypes import windll
import time

def is_file_copy_finished(file_path):
    finished = False

    GENERIC_WRITE         = 1 << 30
    FILE_SHARE_READ       = 0x00000001
    OPEN_EXISTING         = 3
    FILE_ATTRIBUTE_NORMAL = 0x80

    # CreateFileW expects a text path; decode byte strings first.
    if isinstance(file_path, bytes):
        file_path_unicode = file_path.decode('utf-8')
    else:
        file_path_unicode = file_path

    h_file = windll.Kernel32.CreateFileW(file_path_unicode, GENERIC_WRITE, FILE_SHARE_READ, None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)

    # CreateFileW returns INVALID_HANDLE_VALUE (-1) when the open fails.
    if h_file != -1:
        windll.Kernel32.CloseHandle(h_file)
        finished = True

    print('is_file_copy_finished: ' + str(finished))
    return finished

def wait_for_file_copy_finish(file_path):
    # Retry five times a second until the file can be opened for writing.
    while not is_file_copy_finished(file_path):
        time.sleep(0.2)

wait_for_file_copy_finish(r'C:\testfile.txt')

The idea is to try to open the file for writing in share-read mode. The call fails while another process is still writing to it.

Enjoy ;)
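
The same "try to open it for writing" probe can be sketched without ctypes, using only `os.open`. This is a minimal sketch under one assumption: that the process doing the copy holds the destination file without write sharing, so that a second write-open is refused on Windows. The helper name `can_open_for_write` is hypothetical.

```python
import os


def can_open_for_write(path):
    """Return True if `path` can currently be opened for writing.

    Hypothetical helper for illustration. On Windows the open is expected
    to fail with an OSError (a sharing violation) while another process
    still holds the file without write sharing. POSIX systems have no
    mandatory locking, so there a True result says nothing about whether
    a copy is still in progress.
    """
    try:
        # No O_CREAT flag: a missing file raises instead of being created.
        fd = os.open(path, os.O_WRONLY)
    except OSError:
        return False
    os.close(fd)
    return True
```

It can be polled in the same way as `is_file_copy_finished`, for example in a loop with `time.sleep(0.2)` between attempts.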

Dmytro answered Dec 01 '22 00:12


I would add a comment, since this isn't an answer to your question but a different approach, but I don't have enough rep yet. You could try monitoring the file size; once it stops changing, you can assume the copy has finished:

import os
import time

size2 = -1
while True:
    size = os.path.getsize('name of file being copied')
    if size == size2:
        # Same size on two consecutive checks: assume the copy is done.
        break
    # Remember this reading and check again in two seconds.
    size2 = size
    time.sleep(2)
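
The question also asks what happens when several files are copied at once. A blocking loop like the one above handles one file at a time, so one possible approach is to track every pending path and poll them all together. The sketch below assumes that a file whose size has not changed for `quiet_seconds` is complete; the class `PendingFiles` and its methods are hypothetical names, not part of watchdog.

```python
import os
import threading
import time


class PendingFiles:
    """Track several in-flight copies and report each path once its size
    has stopped changing for `quiet_seconds`. Hypothetical helper."""

    def __init__(self, quiet_seconds=2.0):
        self.quiet_seconds = quiet_seconds
        self._lock = threading.Lock()
        self._seen = {}  # path -> (last known size, time of last change)

    def touch(self, path):
        """Register `path`; events for an already tracked path are ignored."""
        with self._lock:
            self._seen.setdefault(path, (-1, time.monotonic()))

    def pop_ready(self):
        """Return (and stop tracking) paths whose size has been stable."""
        now = time.monotonic()
        ready = []
        with self._lock:
            for path, (size, changed_at) in list(self._seen.items()):
                try:
                    current = os.path.getsize(path)
                except OSError:
                    del self._seen[path]  # file vanished; stop tracking it
                    continue
                if current != size:
                    # Still growing: remember the new size and restart the clock.
                    self._seen[path] = (current, now)
                elif now - changed_at >= self.quiet_seconds:
                    del self._seen[path]
                    ready.append(path)
        return ready
```

With the question's code, `touch(event.src_path)` could be called from on_modified, and the `while True` loop in `main()` could call `pop_ready()` after each `time.sleep(1)` and run `load_pgdump` on every returned path. Both on_modified events for one file then collapse into a single restore, provided the second event arrives within the quiet period.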
iri answered Dec 01 '22 00:12