
UNC path pointing to local directory much slower than local access

Some code I'm working with occasionally needs to refer to long UNC paths (e.g. \\?\UNC\MachineName\Path), but we've discovered that no matter where the directory is located, even on the same machine, it's much slower when accessing through the UNC path than the local path.
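For reference, the extended-length forms map onto the ordinary paths roughly like this (the helper below is just an illustration, not part of our real code):

// Illustrative only: roughly how the extended-length (\\?\) forms relate to ordinary paths.
static string ToExtendedLength(string path) {
    if (path.StartsWith(@"\\?\")) return path;                          // already extended-length
    if (path.StartsWith(@"\\")) return @"\\?\UNC\" + path.Substring(2); // \\Server\Share -> \\?\UNC\Server\Share
    return @"\\?\" + path;                                              // C:\Temp -> \\?\C:\Temp
}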

For example, we've written some benchmarking code that writes a string of gibberish to a file and then reads it back, repeated many times. I'm testing it with six different ways of accessing the same shared directory on my dev machine, with the code running on that same machine:

  • C:\Temp
  • \\MachineName\Temp
  • \\?\C:\Temp
  • \\?\UNC\MachineName\Temp
  • \\127.0.0.1\Temp
  • \\?\UNC\127.0.0.1\Temp

And here are the results:

Testing: C:\Temp
Wrote 1000 files to C:\Temp in 861.0647 ms
Read 1000 files from C:\Temp in 60.0744 ms
Testing: \\MachineName\Temp
Wrote 1000 files to \\MachineName\Temp in 2270.2051 ms
Read 1000 files from \\MachineName\Temp in 1655.0815 ms
Testing: \\?\C:\Temp
Wrote 1000 files to \\?\C:\Temp in 916.0596 ms
Read 1000 files from \\?\C:\Temp in 60.0517 ms
Testing: \\?\UNC\MachineName\Temp
Wrote 1000 files to \\?\UNC\MachineName\Temp in 2499.3235 ms
Read 1000 files from \\?\UNC\MachineName\Temp in 1684.2291 ms
Testing: \\127.0.0.1\Temp
Wrote 1000 files to \\127.0.0.1\Temp in 2516.2847 ms
Read 1000 files from \\127.0.0.1\Temp in 1721.1925 ms
Testing: \\?\UNC\127.0.0.1\Temp
Wrote 1000 files to \\?\UNC\127.0.0.1\Temp in 2499.3211 ms
Read 1000 files from \\?\UNC\127.0.0.1\Temp in 1678.18 ms

I tried the IP address to rule out a DNS issue. Could it be checking credentials or permissions on each file access? If so, is there a way to cache that? Does it just assume that, since it's a UNC path, it should do everything over TCP/IP instead of accessing the disk directly? Is something wrong with the code we're using for the reads/writes? I've ripped out the pertinent parts for benchmarking, shown below:

using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;
using Microsoft.Win32.SafeHandles;
using Util.FileSystem;

namespace UNCWriteTest {
    internal class Program {
        [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
        public static extern bool DeleteFile(string path); // File.Delete doesn't handle \\?\UNC\ paths

        private const int N = 1000;

        private const string TextToSerialize =
            "asd;lgviajsmfopajwf0923p84jtmpq93worjgfq0394jktp9orgjawefuogahejngfmliqwegfnailsjdhfmasodfhnasjldgifvsdkuhjsmdofasldhjfasolfgiasngouahfmp9284jfqp92384fhjwp90c8jkp04jk34pofj4eo9aWIUEgjaoswdfg8jmp409c8jmwoeifulhnjq34lotgfhnq34g";

        private static readonly byte[] _Buffer = Encoding.UTF8.GetBytes(TextToSerialize);

        public static string WriteFile(string basedir) {
            string fileName = Path.Combine(basedir, string.Format("{0}.tmp", Guid.NewGuid()));

            try {
                IntPtr writeHandle = NativeFileHandler.CreateFile(
                    fileName,
                    NativeFileHandler.EFileAccess.GenericWrite,
                    NativeFileHandler.EFileShare.None,
                    IntPtr.Zero,
                    NativeFileHandler.ECreationDisposition.New,
                    NativeFileHandler.EFileAttributes.Normal,
                    IntPtr.Zero);

                // if file was locked
                int fileError = Marshal.GetLastWin32Error();
                if ((fileError == 32 /* ERROR_SHARING_VIOLATION */) || (fileError == 80 /* ERROR_FILE_EXISTS */)) {
                    throw new Exception("oopsy");
                }

                using (var h = new SafeFileHandle(writeHandle, true)) {
                    using (var fs = new FileStream(h, FileAccess.Write, NativeFileHandler.DiskPageSize)) {
                        fs.Write(_Buffer, 0, _Buffer.Length);
                    }
                }
            }
            catch (IOException) {
                throw;
            }
            catch (Exception ex) {
                throw new InvalidOperationException(" code " + Marshal.GetLastWin32Error(), ex);
            }

            return fileName;
        }

        public static void ReadFile(string fileName) {
            var fileHandle =
                new SafeFileHandle(
                    NativeFileHandler.CreateFile(fileName, NativeFileHandler.EFileAccess.GenericRead, NativeFileHandler.EFileShare.Read, IntPtr.Zero,
                                                 NativeFileHandler.ECreationDisposition.OpenExisting, NativeFileHandler.EFileAttributes.Normal, IntPtr.Zero), true);

            using (fileHandle) {
                //check the handle here to get a bit cleaner exception semantics
                if (fileHandle.IsInvalid) {
                    //ms-help://MS.MSSDK.1033/MS.WinSDK.1033/debug/base/system_error_codes__0-499_.htm
                    int errorCode = Marshal.GetLastWin32Error();
                    //now that we've taken more than our allotted share of time, throw the exception
                    throw new IOException(string.Format("file read failed on {0} to {1} with error code {1}", fileName, errorCode));
                }

                //we have a valid handle and can actually read a stream, exceptions from serialization bubble out
                using (var fs = new FileStream(fileHandle, FileAccess.Read, 1*NativeFileHandler.DiskPageSize)) {
                    //if serialization fails, we'll just let the normal serialization exception flow out
                    var foo = new byte[256];
                    fs.Read(foo, 0, 256);
                }
            }
        }

        public static string[] TestWrites(string baseDir) {
            try {
                var fileNames = new List<string>();
                DateTime start = DateTime.UtcNow;
                for (int i = 0; i < N; i++) {
                    fileNames.Add(WriteFile(baseDir));
                }
                DateTime end = DateTime.UtcNow;

                Console.Out.WriteLine("Wrote {0} files to {1} in {2} ms", N, baseDir, end.Subtract(start).TotalMilliseconds);
                return fileNames.ToArray();
            }
            catch (Exception e) {
                Console.Out.WriteLine("Failed to write for " + baseDir + " Exception: " + e.Message);
                return new string[] {};
            }
        }

        public static void TestReads(string baseDir, string[] fileNames) {
            try {
                DateTime start = DateTime.UtcNow;

                for (int i = 0; i < N; i++) {
                    ReadFile(fileNames[i%fileNames.Length]);
                }
                DateTime end = DateTime.UtcNow;

                Console.Out.WriteLine("Read {0} files from {1} in {2} ms", N, baseDir, end.Subtract(start).TotalMilliseconds);
            }
            catch (Exception e) {
                Console.Out.WriteLine("Failed to read for " + baseDir + " Exception: " + e.Message);
            }
        }

        private static void Main(string[] args) {
            foreach (string baseDir in args) {
                Console.Out.WriteLine("Testing: {0}", baseDir);

                string[] fileNames = TestWrites(baseDir);

                TestReads(baseDir, fileNames);

                foreach (string fileName in fileNames) {
                    DeleteFile(fileName);
                }
            }
        }
    }
}
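
For completeness, Main simply treats each command-line argument as a directory to test, so a run over all six variants looks something like this (the executable name is assumed from the namespace and may differ in your build):

UNCWriteTest.exe C:\Temp \\MachineName\Temp \\?\C:\Temp \\?\UNC\MachineName\Temp \\127.0.0.1\Temp \\?\UNC\127.0.0.1\Temp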
Asked Oct 15 '12 by Chris Doggett

1 Answer

This doesn't surprise me. You're writing/reading a fairly small amount of data, so the file system cache is probably minimizing the impact of the physical disk I/O; basically, the bottleneck is going to be the CPU. I'm not certain whether the traffic goes via the TCP/IP stack or not, but at a minimum the SMB protocol is involved. Among other things, that means requests are passed back and forth between the SMB client process and the SMB server process, so you've got context switching between three distinct processes, including your own. Using the local file system path, you switch into kernel mode and back, but no other process is involved. Context switching is much slower than the transition to and from kernel mode.

There are likely to be two distinct additional overheads, one per file and one per kilobyte of data. In this particular test the per-file SMB overhead is likely to be dominant. Because the amount of data involved also affects the impact of physical disk I/O, you may find that this is only really a problem when dealing with lots of small files.
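
If you want to see the two costs separately, one quick experiment is to write the same total number of bytes once as a single large file and once as many small files, against both a local path and a UNC path. A minimal sketch (paths, sizes and counts are purely illustrative):

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

internal static class OverheadSketch {
    // Writes totalBytes to baseDir split across fileCount files and reports the elapsed time,
    // so the fixed per-file cost (SMB open/close round trips) can be compared with the
    // per-byte cost of moving the data itself.
    private static void TimeWrites(string baseDir, int fileCount, int totalBytes) {
        var payload = new byte[totalBytes / fileCount];
        var written = new List<string>();

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < fileCount; i++) {
            string path = Path.Combine(baseDir, string.Format("{0}.tmp", Guid.NewGuid()));
            File.WriteAllBytes(path, payload);
            written.Add(path);
        }
        sw.Stop();

        Console.WriteLine("{0,5} file(s) x {1,9} bytes to {2}: {3} ms",
                          fileCount, payload.Length, baseDir, sw.ElapsedMilliseconds);

        foreach (string path in written) {
            File.Delete(path); // cleanup happens outside the timed section
        }
    }

    private static void Main() {
        const int totalBytes = 1 << 24; // 16 MB in total, split differently on each run
        foreach (string dir in new[] { @"C:\Temp", @"\\MachineName\Temp" }) { // illustrative paths
            TimeWrites(dir, 1, totalBytes);    // one large file: per-byte cost dominates
            TimeWrites(dir, 1000, totalBytes); // many small files: per-file cost dominates
        }
    }
}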

Answered by Harry Johnston