Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to make CreateFile as fast as possible

I need to read the contents of several thousands of small files at startup. On linux, just using fopen and reading is very fast. On Windows, this happens very slowly.

I have switched to using Overlapped I/O (Asynchronous I/O) using ReadFileEx, where Windows does a callback when data is ready to read.

However, the actual thousands of calls to CreateFile itself are still a bottleneck. Note that I supply my own buffers, turn on the NO_BUFFERING flag, give the SERIAL hint, etc. However, the calls to CreateFile take several 10s of seconds, whereas on linux everything is done much faster.

Is there anything that can be done to get these files ready for reading more quickly?

The call to CreateFile is:

            hFile = CreateFile(szFullFileName,
                GENERIC_READ,
                FILE_SHARE_READ | FILE_SHARE_WRITE,
                NULL,
                OPEN_EXISTING,
                FILE_ATTRIBUTE_NORMAL | FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING | FILE_FLAG_SEQUENTIAL_SCAN,
                NULL);
like image 474
Macker Avatar asked Feb 03 '23 13:02

Macker


1 Answers

CreateFile in kernel32.dll has some extra overhead compared to the kernel syscall NtCreateFile in ntdll.dll. This is the real function that CreateFile calls to ask the kernel to open the file. If you need to open a large number of files, NtOpenFile will be more efficient by avoiding the special cases and path translation that Win32 has-- things that wouldn't apply to a bunch of files in a directory anyway.

NTSYSAPI NTSTATUS NTAPI NtOpenFile(OUT HANDLE *FileHandle, IN ACCESS_MASK DesiredAccess, IN OBJECT_ATTRIBUTES *ObjectAttributes, OUT IO_STATUS_BLOCK *IoStatusBlock, IN ULONG ShareAccess, IN ULONG OpenOptions);

HANDLE Handle;
OBJECT_ATTRIBUTES Oa = {0};
UNICODE_STRING Name_U;
IO_STATUS_BLOCK IoSb;

RtlInitUnicodeString(&Name_U, Name);

Oa.Length = sizeof Oa;
Oa.ObjectName = &Name_U;
Oa.Attributes = CaseInsensitive ? OBJ_CASE_INSENSITIVE : 0;
Oa.RootDirectory = ParentDirectoryHandle;

Status = NtOpenFile(&Handle, FILE_READ_DATA, &Oa, &IoSb, FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, FILE_SEQUENTIAL_ONLY);

Main downside: this API is not supported by Microsoft for use in user mode. That said, the equivalent function is documented for kernel mode use and hasn't changed since the first release of Windows NT in 1993.

NtOpenFile also allows you to open a file relative to an existing directory handle (ParentDirectoryHandle in the example) which should cut down on some of the filesystem overhead in locating the directory.

In the end, NTFS may just be too slow in handling directories with large numbers of files as Carey Gregory said.

like image 137
Chris Smith Avatar answered Feb 05 '23 03:02

Chris Smith