Reading file over network slow due to extra reads

I'm reading a file in one of two patterns: either a row of data (1600 sequential reads of 17 bytes each) or a column of data (1600 reads of 17 bytes each, spaced 1600*17 = 27,200 bytes apart). The file is either on a local drive or a remote drive. I run the reads 10 times, so in each case I expect to read 272,000 bytes of data.
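
For reference, here is a minimal sketch of the column-read pattern described above, using plain Win32 calls. The path and constants are placeholders, not the actual test program:

    #include <windows.h>

    int main()
    {
        const LONGLONG kStride = 1600LL * 17;  // 27,200 bytes between entries in a column
        const DWORD    kItem   = 17;           // bytes per read
        const int      kReads  = 1600;

        // Hypothetical path standing in for the real data file.
        HANDLE h = CreateFileA("\\\\server\\share\\datafile",
                               GENERIC_READ, FILE_SHARE_READ, nullptr,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
        if (h == INVALID_HANDLE_VALUE)
            return 1;

        char buf[17];
        for (int i = 0; i < kReads; ++i) {
            LARGE_INTEGER pos;
            pos.QuadPart = (LONGLONG)i * kStride;      // seek to the next column entry
            SetFilePointerEx(h, pos, nullptr, FILE_BEGIN);
            DWORD got = 0;
            ReadFile(h, buf, kItem, &got, nullptr);    // 17-byte read
        }
        CloseHandle(h);
        return 0;
    }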

On the local drive, I see what I expect. On the remote drive when reading sequentially I also see what I expect but when reading a column, I see a ton of extra reads being done. They are 32,768 bytes long and don't seem to be used but they make the amount of data being read jump from 272,000 bytes to anywhere from 79 MB to 106 MB. Here is the output using Process Monitor:

1:39:39.4624488 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,390,069, Length: 17
1:39:39.4624639 PM  DiskSpeedTest.exe   89628   FASTIO_CHECK_IF_POSSIBLE    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Operation: Read, Offset: 9,390,069, Length: 17
1:39:39.4624838 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,388,032, Length: 32,768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal
1:39:39.4633839 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,417,269, Length: 17
1:39:39.4634002 PM  DiskSpeedTest.exe   89628   FASTIO_CHECK_IF_POSSIBLE    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Operation: Read, Offset: 9,417,269, Length: 17
1:39:39.4634178 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,444,469, Length: 17
1:39:39.4634324 PM  DiskSpeedTest.exe   89628   FASTIO_CHECK_IF_POSSIBLE    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Operation: Read, Offset: 9,444,469, Length: 17
1:39:39.4634529 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,441,280, Length: 32,768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal
1:39:39.4642199 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,471,669, Length: 17
1:39:39.4642396 PM  DiskSpeedTest.exe   89628   FASTIO_CHECK_IF_POSSIBLE    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Operation: Read, Offset: 9,471,669, Length: 17
1:39:39.4642582 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,498,869, Length: 17
1:39:39.4642764 PM  DiskSpeedTest.exe   89628   FASTIO_CHECK_IF_POSSIBLE    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Operation: Read, Offset: 9,498,869, Length: 17
1:39:39.4642922 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,498,624, Length: 32,768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal

Notice the extra 32,768-byte reads with I/O Flags set to Non-cached, Paging I/O, Synchronous Paging I/O. These extra reads are what take the total from 272 KB to 106 MB, and they are causing the slowness. They don't happen when reading from a local file, or when I'm reading a row so all the access is sequential.

I've tried setting FILE_FLAG_RANDOM_ACCESS but it doesn't seem to help. Any ideas on what is causing these extra reads and how to make them stop?
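
For context, this is how the flag gets passed to CreateFile (placeholder path); note that FILE_FLAG_RANDOM_ACCESS is only a hint to the cache manager, not a guarantee of any particular read behavior:

    // FILE_FLAG_RANDOM_ACCESS hints that access will be random so the
    // cache manager should not read ahead; it does not disable paging
    // reads. The path here is a hypothetical placeholder.
    HANDLE h = CreateFileA("\\\\server\\share\\datafile",
                           GENERIC_READ, FILE_SHARE_READ, nullptr,
                           OPEN_EXISTING,
                           FILE_ATTRIBUTE_NORMAL | FILE_FLAG_RANDOM_ACCESS,
                           nullptr);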

The tests are being run on a 64-bit Vista system. I can provide source code for a program that demonstrates the problem, as well as a console program that runs the tests.

asked Jan 10 '10 by Brad Pepers

2 Answers

You might be running into oplock (opportunistic lock) issues over SMB. Typically, when reading or saving a file over the network, Windows will pull the full file over to the client, work on it there, and send back the changes. When you are working with flat-file databases or similar files, this can cause unnecessary reads across an SMB file share.

I'm not sure whether you can simply pull over the whole file, read the rows from the local copy, and then push back any changes, but it may be worth trying; a rough sketch of that approach follows.
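
Something along these lines, with placeholder paths and most error handling omitted:

    #include <windows.h>
    #include <string.h>

    // Pull the whole file across the network once, then run the small
    // reads against the local copy: one bulk sequential transfer instead
    // of thousands of tiny remote reads. Paths are hypothetical.
    bool ReadViaLocalCopy()
    {
        const char* remote = "\\\\server\\share\\datafile";

        char local[MAX_PATH];
        GetTempPathA(MAX_PATH, local);
        strcat_s(local, "datafile.tmp");

        if (!CopyFileA(remote, local, FALSE))   // single bulk transfer
            return false;

        HANDLE h = CreateFileA(local, GENERIC_READ, FILE_SHARE_READ, nullptr,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
        if (h == INVALID_HANDLE_VALUE)
            return false;

        // ... do the 17-byte row/column reads against the local copy here ...

        CloseHandle(h);
        DeleteFileA(local);
        return true;
    }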

You'll read some nightmare stories about oplocks and flat-file databases.

http://msdn.microsoft.com/en-us/library/aa365433%28VS.85%29.aspx

Not sure if this solves your problem, but it might get you pointed in the right direction. Good luck!

answered Oct 03 '22 by Jeremy Heslop


I found the answer to this. Windows does file reads through the page cache, so when I read 17 bytes it first has to transfer a full 32 KB page over the network, and only then can it copy the 17 bytes I want out of the page cache. Since my column reads are spaced 27,200 bytes apart, nearly every one lands in a different page, so each 17-byte read pulls ~32 KB across the wire. Nasty result on performance!

The same thing actually happens the first time the reads are done on a local file, since in that case Windows still loads a full page at a time into the page cache. But the second time I run the test locally, the file is already entirely in the page cache, so I don't see it. And if SuperFetch is turned on and I've been running these tests for a while, Windows starts loading the file into the cache before I even run my test application, so again I don't see the page reads being done.

So the operating system is doing a lot of things behind the scenes that make it tough to get good performance testing done!
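
A sketch of one way around this, assuming the row/column layout described in the question (1600 rows of 27,200 bytes; path and constants are placeholders): read the whole region in one large request, so the transfer is a single sequential pull, then extract the column in memory.

    #include <windows.h>
    #include <cstring>
    #include <vector>

    // Instead of 1600 scattered 17-byte reads, each of which faults a
    // 32 KB page across the network, read the whole region once and
    // pull the column out of the in-memory buffer. Constants match the
    // sizes in the question; path and function name are placeholders.
    std::vector<char> ReadColumn(const char* path, int column)
    {
        const size_t kRowBytes = 1600 * 17;   // 27,200 bytes per row
        const int    kRows     = 1600;

        HANDLE h = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                               OPEN_EXISTING,
                               FILE_ATTRIBUTE_NORMAL | FILE_FLAG_SEQUENTIAL_SCAN,
                               nullptr);
        if (h == INVALID_HANDLE_VALUE)
            return {};

        std::vector<char> whole(kRowBytes * kRows);   // ~41.5 MB
        DWORD got = 0;
        ReadFile(h, whole.data(), (DWORD)whole.size(), &got, nullptr);  // one bulk read
        CloseHandle(h);

        std::vector<char> col(kRows * 17);
        for (int r = 0; r < kRows; ++r)               // extract the column in memory
            memcpy(&col[r * 17],
                   &whole[(size_t)r * kRowBytes + (size_t)column * 17], 17);
        return col;
    }

Another route is FILE_FLAG_NO_BUFFERING, which bypasses the page cache entirely, but it requires sector-aligned offsets, lengths, and buffers, so the bulk-read approach is usually simpler.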

answered Oct 03 '22 by Brad Pepers