Checking if folder has files

Tags: c#, directory

I have a program that records in a database which folders contain files and which are empty. Currently I'm using

bool hasFiles = Directory.GetFiles(path).Length > 0;

but it takes almost an hour, and I can't do anything else during that time.

Is there a faster way to check whether a folder contains any files?

user278618 asked Apr 26 '10 10:04


2 Answers

To check whether any files exist inside the directory or its subdirectories, in .NET 4 you can use the method below:

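// Requires: using System.IO; using System.Linq;
public bool isDirectoryContainFiles(string path) {
    if (!Directory.Exists(path)) return false;
    // EnumerateFiles is lazy, so Any() stops at the first file found.
    return Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories).Any();
}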
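
The question as asked only needs to know whether a single folder directly contains files, so a non-recursive variant may be enough. The following is a minimal sketch under that assumption; the names `FolderCheck` and `HasFilesTopLevel` are hypothetical, and treating an unreadable folder as empty is a design choice you may want to change.

```csharp
using System;
using System.IO;
using System.Linq;

static class FolderCheck {
    // Hypothetical helper: true if `path` exists and directly contains at least one file.
    public static bool HasFilesTopLevel(string path) {
        if (!Directory.Exists(path)) return false;
        try {
            // Lazy enumeration: Any() returns as soon as the first file is seen,
            // instead of building the whole array as Directory.GetFiles does.
            return Directory.EnumerateFiles(path, "*", SearchOption.TopDirectoryOnly).Any();
        } catch (UnauthorizedAccessException) {
            // Assumption: a folder we cannot read is reported as having no files.
            return false;
        }
    }
}
```

Because only the first entry is requested, the cost per folder should not grow with the number of files it holds.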
Leng Weh Seng answered Oct 07 '22 07:10


The key to speeding up such a cross-network search is to cut down the number of requests across the network. Rather than getting all the directories, and then checking each for files, try and get everything from one call.

In .NET 3.5 there is no single method to recursively get all files and folders, so you have to build it yourself (see below). In .NET 4 new overloads exist to do this in one step.

Using DirectoryInfo one also gets information on whether the returned name is a file or directory, which cuts down calls as well.

This means splitting a list of all the directories and files becomes something like this:

// Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq;
struct AllDirectories {
  public List<string> DirectoriesWithoutFiles { get; set; }
  public List<string> DirectoriesWithFiles { get; set; }
}

static class FileSystemScanner {
  public static AllDirectories DivideDirectories(string startingPath) {
    var startingDir = new DirectoryInfo(startingPath);

    // allContent is an IList<FileSystemInfo> (see GetAllFileSystemObjects below)
    var allContent = GetAllFileSystemObjects(startingDir);
    var allFiles = allContent.Where(f => (f.Attributes & FileAttributes.Directory) == 0)
                             .Cast<FileInfo>();
    var dirs = allContent.Where(f => (f.Attributes & FileAttributes.Directory) != 0)
                         .Cast<DirectoryInfo>();
    // Full paths of every directory found below startingPath, compared
    // case-insensitively. The starting directory itself is not classified.
    var allDirs = new HashSet<string>(dirs.Select(d => d.FullName),
                                      StringComparer.OrdinalIgnoreCase);

    var res = new AllDirectories {
      DirectoriesWithFiles = new List<string>()
    };
    foreach (var file in allFiles) {
      var dirName = file.DirectoryName;
      if (allDirs.Remove(dirName)) {
        // Was removed, so first time this dir name seen.
        res.DirectoriesWithFiles.Add(dirName);
      }
    }
    // allDirs now just contains directories that directly hold no files
    res.DirectoriesWithoutFiles = new List<string>(allDirs);
    return res;
  }
}

Implementing GetAllFileSystemObjects depends on the .NET version. On .NET 4 it is very easy:

static IList<FileSystemInfo> GetAllFileSystemObjects(DirectoryInfo root) {
  // Single call that returns every file and directory beneath root.
  return root.GetFileSystemInfos("*", SearchOption.AllDirectories);
}

On earlier versions a little more work is needed:

static IList<FileSystemInfo> GetAllFileSystemObjects(DirectoryInfo root) {
  var res = new List<FileSystemInfo>();
  var pending = new Queue<DirectoryInfo>(new [] { root });

  // Breadth-first walk: one GetFileSystemInfos call per directory.
  while (pending.Count > 0) {
    var dir = pending.Dequeue();
    var content = dir.GetFileSystemInfos();
    res.AddRange(content);
    foreach (var subDir in content.Where(f => (f.Attributes & FileAttributes.Directory) != 0)
                                  .Cast<DirectoryInfo>()) {
      pending.Enqueue(subDir);
    }
  }

  return res;
}

This approach calls into the filesystem as few times as possible, just once on .NET 4 or once per directory on earlier versions, allowing the network client and server to minimise the number of underlying filesystem calls and network round trips.

Getting FileSystemInfo instances has the disadvantage of potentially needing multiple file system operations (I believe this is somewhat OS dependent). However, any solution needs to know whether each name is a file or a directory, so at some level this is unavoidable (short of resorting to P/Invoke of FindFirstFile/FindNextFile/FindClose).


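As an aside, the above would be easier with a Partition extension method (signature only):

// Declared inside a static class, e.g. static class Extensions { ... }
public static Tuple<IEnumerable<T>, IEnumerable<T>> Partition<T>(
                                                 this IEnumerable<T> input,
                                                 Func<T, bool> partition);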

Writing that to be lazy would be an interesting exercise (only consuming input when something iterates over one of the outputs, while buffering the other).
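
For comparison, an eager version is straightforward. The sketch below is one possible implementation, not the lazy variant described above: it walks the input once and buffers both halves in lists. The class name `PartitionExtensions` and the parameter name `predicate` are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PartitionExtensions {
    // Eager partition: enumerates `input` exactly once and buffers both halves.
    // Item1 holds the elements for which `predicate` is true, Item2 the rest.
    public static Tuple<IEnumerable<T>, IEnumerable<T>> Partition<T>(
            this IEnumerable<T> input, Func<T, bool> predicate) {
        var matching = new List<T>();
        var rest = new List<T>();
        foreach (var item in input) {
            (predicate(item) ? matching : rest).Add(item);
        }
        return Tuple.Create<IEnumerable<T>, IEnumerable<T>>(matching, rest);
    }
}
```

With this in place, the two Where calls in DivideDirectories could be replaced by a single call such as allContent.Partition(f => (f.Attributes & FileAttributes.Directory) != 0), taking directories from Item1 and files from Item2.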

Richard answered Oct 07 '22 06:10