
FileStream very slow on application-cold start

A very similar question has already been asked here on SO, in case you are interested. As we will see, though, the accepted answer to that question does not always apply (and it never applies to my application's usage pattern).

The performance-determining code consists of a FileStream constructor call (to open a file) and a SHA1 hash (the .NET Framework implementation). The code is pretty much the C# version of what was asked in the question I've linked to above.
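A minimal sketch of the kind of code being measured, assuming the file is opened read-only and hashed straight from the stream. The class name `HashTiming` and the method `HashFile` are hypothetical and not taken from the application. The sketch times the two steps separately with a `Stopwatch`, which is roughly the split the profiler reports below.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

public static class HashTiming
{
    // Hypothetical helper: hashes one file and reports how long
    // the open step and the hash step each took.
    public static byte[] HashFile(string path, out TimeSpan openTime, out TimeSpan hashTime)
    {
        var sw = Stopwatch.StartNew();

        // Step 1: open the file (the FileStream constructor).
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            openTime = sw.Elapsed;

            // Step 2: compute the SHA1 hash. ComputeHash reads the stream
            // to the end, so this timing includes reading the file data.
            sw.Restart();
            using (var sha1 = SHA1.Create())
            {
                byte[] hash = sha1.ComputeHash(stream);
                hashTime = sw.Elapsed;
                return hash;
            }
        }
    }
}
```

Calling `HashTiming.HashFile(path, out openTime, out hashTime)` once per file and summing the two timings gives a rough breakdown without a profiler, although wall-clock timings of a few milliseconds are noisy.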

Case 1: The application is started, either for the first time or for the Nth time but with a different target file set. It is then told to compute hash values for files that have never been accessed before.

  • ~50ms
  • 80% FileStream constructor
  • 18% hash computation

Case 2: The application is fully terminated and started again, then asked to compute hashes for the same files:

  • ~8ms
  • 90% hash computation
  • 8% FileStream constructor

Problem
My application always runs in Case 1. It will never be asked to recompute a hash for a file that it has already visited.

So my rate-determining step is the FileStream constructor! Is there anything I can do to speed up this use case?

Thank you.

P.S. Stats were gathered using the JetBrains profiler.

Alex K asked Nov 02 '09 20:11


2 Answers

The file system and/or the disk controller will cache recently accessed files and sectors.

The rate-determining step is reading the file, not constructing a FileStream object, and it is completely normal for the second run to be significantly faster once the data is in the cache.
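One way to observe the cache effect is to time an open-and-read of the same file twice in a row. This is only a sketch: the class name `CacheProbe` and the method `TimeOpenAndRead` are hypothetical, and the first call is a true cold read only if the file has not been touched since it was last evicted from the cache (for example, after a reboot).

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class CacheProbe
{
    // Hypothetical helper: opens the file, reads it to the end,
    // and returns the total elapsed time for both steps.
    public static TimeSpan TimeOpenAndRead(string path)
    {
        var buffer = new byte[64 * 1024];
        var sw = Stopwatch.StartNew();

        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            // Drain the stream; the data itself is discarded.
            while (stream.Read(buffer, 0, buffer.Length) > 0)
            {
            }
        }

        return sw.Elapsed;
    }
}
```

Calling `CacheProbe.TimeOpenAndRead(path)` twice on a file that has not been accessed recently should typically show the second call completing much faster, because the data is then served from memory rather than from the disk.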

Joe answered Sep 22 '22 23:09


... but with different target file set.

That is the key phrase: your app will not be able to take advantage of the file system cache, as it did in the second measurement. The directory info can't come from RAM because it hasn't been read yet, so the OS always has to fall back to the disk drive, and that is slow.

Only better hardware can speed it up. 50 msec is about the standard amount of time needed for a spindle drive, and 20 msec is about as low as such drives can go. Read head seek time is the hard mechanical limit. That's easy to beat today: SSDs are widely available and reasonably affordable. The only problem is that once you get used to one, you never go back :)

Hans Passant answered Sep 21 '22 23:09