Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

System.IO.FileSystemWatcher to monitor a network-server folder - Performance considerations

I want to watch a folder tree on a network server for changes. The files all have a specific extension. There are about 200 folders in the tree and about 1200 files with the extension I am watching.

I can't write a service to run on the server (off-limits!) so the solution has to be local to the client. Timeliness is not particularly important. I can live with a minute or more delay in notifications. I am watching for Create, Delete, Rename and Changes.

Would using the .NET System.IO.fileSystemWatcher create much of a load on the server?

How about 10 separate watchers to cut down the number of folders/files being watched? (down to 200 from 700 folders, 1200 from 5500 files in total) More network traffic instead of less? My thoughts are a reshuffle on the server to put the watched files under 1 tree. I may not always have this option hence the team of watchers.

I suppose the other solution is a periodic check if the FSW creates an undue load on the server, or if it doesn't work for a whole bunch of SysAdmin type reasons.

Is there a better way to do this?

like image 761
CAD bloke Avatar asked Sep 30 '08 04:09

CAD bloke


People also ask

Which of the following are types of changes that can be detected by the FileSystemWatcher?

The FileSystemWatcher lets you detect several types of changes in a directory or file, like the 'LastWrite' date and time, changes in the size of files or directories etc. It also helps you detect if a file or directory is deleted, renamed or created.

What is file watcher?

File Watcher is an IntelliJ IDEA tool that allows you to automatically run a command-line tool like compilers, formatters, or linters when you change or save a file in the IDE.

What is the parameter that has to be used when a file is renamed after setting a FileSystemWatcher on the file?

If we want to notify the administrator, that a file has been changed, we would require the name of the file to which the change has been made. To get the name of the file we use the event handler argument FileSystemEventArgs.


1 Answers

From a server load point of view, using the IO.FileSystemWatcher for remote change notifications in the scenario you describe is probably the most efficient method possible. It uses the FindFirstChangeNotification and ReadDirectoryChangesW Win32 API functions internally, which in turn communicate with the network redirector in an optimized way (assuming standard Windows networking: if a third-party redirector is used, and it doesn't support the required functionality, things won't work at all). The .NET wrapper also uses async I/O and everything, further assuring maximum efficiency.

The only problem with this solution is that it's not very reliable. Other than having to deal with network connections going away temporarily (which isn't too much of a problem, since IO.FileSystemWatcher will fire an error event in this case which you can handle), the underlying mechanism has certain fundamental limitations. From the MSDN documentation for the Win32 API functions:

  • ReadDirectoryChangesW fails with ERROR_INVALID_PARAMETER when the buffer length is greater than 64 KB and the application is monitoring a directory over the network. This is due to a packet size limitation with the underlying file sharing protocols

  • Notifications may not be returned when calling FindFirstChangeNotification for a remote file system

In other words: under high load (when you would need a large buffer) or, worse, under random unspecified circumstances, you may not get the notifications you expect. This is even an issue with local file system watchers, but it's much more of a problem over the network. Another question here on SO details the inherent reliability problems with the API in a bit more detail.

When using file system watchers, your application should be able to deal with these limitations. For example:

  • If the files you're looking for have sequence numbers, store the last sequence number you got notified about, so you can look for 'gaps' on future notifications and process the files for which you didn't get notified;

  • On receiving a notification, always do a full directory scan. This may sound really bad, but since the scan is event-driven, it's still much more efficient than dumb polling. Also, as long as you keep the total number of files in a single directory, as well as the number of directories to scan, under a thousand or so, the impact of this operation on performance should be pretty minimal anyway.

Setting up multiple listeners is something you should avoid as much as possible: if anything, this will make things even less reliable...

Anyway, if you absolutely have to use file system watchers, things can work OK as long as you're aware of the limitations, and don't expect 1:1 notification for every file modified/created.

So, if you have other options (essentially, having the process writing the files notify you in a non-file system based way: any regular RPC method will be an improvement...), those are definitely worth looking into from a reliability point of view.

like image 65
mdb Avatar answered Sep 24 '22 05:09

mdb