
What level are fread thread locks on? What level do they need to be on?

Visual Studio's fread "locks out other threads." There is an alternate version _fread_nolock, which reads "without locking other threads", which should only be used "in thread-safe contexts such as single-threaded applications or where the calling scope already handles thread isolation."

Even after reading other somewhat relevant discussions on the two, I'm unsure whether the locking that fread implements is on a specific FILE struct, on a specific actual file, or on all fread calls, even those on totally different files.

If you use the nolock versions, what level of locking do you need to provide? Can multiple threads in parallel be reading separate files without any locking? Can multiple threads in parallel be writing separate files without any locking? Or are there global or static variables involved that would be corrupted?

So, by using the nolock versions, can you potentially achieve better I/O throughput (if you aren't needlessly moving heads, e.g. when reading off separate drives or an SSD), or is the potential gain just reducing redundant locks to a single lock (which should be negligible)?

Does VS' ifstream.read function work just like the regular fread? (I don't see a nolock version of it.)

user1902689 asked Apr 25 '15 21:04


1 Answer

The MS standard library implementation fully supports multi-threading. The C++ standard explains this requirement:

27.2.3: Concurrent access to a stream object, stream buffer object, or C Library stream by multiple threads may result in a data race unless otherwise specified.

If one thread makes a library call a that writes a value to a stream and, as a result, another thread reads this value from the stream through a library call b such that this does not result in a data race, then a’s write synchronizes with b’s read.

This means that when you write to a stream, a lock is taken (not a file lock, but a lock guarding concurrent access to the in-memory stream data structure) to make sure that concurrency is well managed for all the other threads using the same stream. The lock belongs to the stream object (the FILE), so threads working on different streams do not block each other.

This locking overhead is always there, even when it is not needed. It can have a performance cost, according to Microsoft:

the performance of the multithreaded libraries has been improved and is close to the performance of the now-eliminated single-threaded libraries. For those situations when even higher performance is required, there are several new features.

This is why the _nolock functions are provided. They access the stream directly without thread locking. They must be used with extreme care, and only in situations such as:

  • if your application is single-threaded (another process using the same file has its own data structure, and the OS manages concurrency there)
  • if you're sure that no two threads use the same stream (for example, if you have only one reader thread and writing is done outside your program)
  • if you have another synchronisation mechanism that protects the critical section of your code, for example a mutex lock, or a thread-safe non-blocking algorithm that makes use of atomics

In such cases, the additional lock for stream access is redundant. For file-intensive functions, it could then be worth using the _nolock versions.
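As a rough illustration of the second case (no two threads share a stream), here is a minimal sketch in which each thread opens, reads and closes its own FILE, so no stream lock is needed for the reads. The names read_nolock, count_bytes_private_stream and count_bytes_parallel are hypothetical helpers made up for this example. The shim assumes the MSVC CRT (_fread_nolock) or glibc (fread_unlocked) and falls back to plain fread elsewhere.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <thread>
#include <vector>

// Hypothetical shim: pick the unlocked read of the platform, if there is one.
#if defined(_MSC_VER)
inline std::size_t read_nolock(void* p, std::size_t sz, std::size_t n, std::FILE* f) {
    return _fread_nolock(p, sz, n, f);
}
#elif defined(__GLIBC__)
inline std::size_t read_nolock(void* p, std::size_t sz, std::size_t n, std::FILE* f) {
    return fread_unlocked(p, sz, n, f);
}
#else
// Assumption: no unlocked variant is known here, so use the locking fread.
inline std::size_t read_nolock(void* p, std::size_t sz, std::size_t n, std::FILE* f) {
    return std::fread(p, sz, n, f);
}
#endif

// Counts the bytes of one file. The FILE is private to the calling thread:
// it is opened, read and closed here and never handed to another thread.
std::size_t count_bytes_private_stream(const std::string& path) {
    std::FILE* f = std::fopen(path.c_str(), "rb");
    if (!f) return 0;
    char buf[4096];
    std::size_t total = 0;
    std::size_t n = 0;
    while ((n = read_nolock(buf, 1, sizeof buf, f)) > 0) total += n;
    std::fclose(f);
    return total;
}

// Starts one thread per file; each thread writes only its own slot of `sizes`.
std::vector<std::size_t> count_bytes_parallel(const std::vector<std::string>& paths) {
    std::vector<std::size_t> sizes(paths.size(), 0);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < paths.size(); ++i)
        workers.emplace_back([&, i] { sizes[i] = count_bytes_private_stream(paths[i]); });
    for (auto& t : workers) t.join();
    return sizes;
}
```

The point of the design is that isolation comes from ownership rather than from a lock: as soon as a FILE pointer is passed to a second thread, the unlocked read is no longer safe.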

Note: as you've pointed out, it's only worth using the nolock versions for intensive file access, where you make millions of calls.
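For the "millions of small reads" situation, one possible pattern is to take the stream lock once and issue the unlocked reads inside it, so the calling scope provides the isolation and the lock is acquired once instead of once per record. This is only a sketch: lock_stream, unlock_stream, read_nolock and read_records_locked_once are hypothetical names, and the shim assumes the MSVC CRT (_lock_file, _unlock_file, _fread_nolock) or glibc (flockfile, funlockfile, fread_unlocked), with a plain fread fallback elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical shims over the platform's per-stream lock and unlocked read.
#if defined(_MSC_VER)
inline void lock_stream(std::FILE* f)   { _lock_file(f); }
inline void unlock_stream(std::FILE* f) { _unlock_file(f); }
inline std::size_t read_nolock(void* p, std::size_t sz, std::size_t n, std::FILE* f) {
    return _fread_nolock(p, sz, n, f);
}
#elif defined(__GLIBC__)
inline void lock_stream(std::FILE* f)   { flockfile(f); }
inline void unlock_stream(std::FILE* f) { funlockfile(f); }
inline std::size_t read_nolock(void* p, std::size_t sz, std::size_t n, std::FILE* f) {
    return fread_unlocked(p, sz, n, f);
}
#else
// Assumption: no unlocked API is known here; fread locks internally on each call.
inline void lock_stream(std::FILE*)   {}
inline void unlock_stream(std::FILE*) {}
inline std::size_t read_nolock(void* p, std::size_t sz, std::size_t n, std::FILE* f) {
    return std::fread(p, sz, n, f);
}
#endif

// Reads up to max_records records of record_size bytes each into `out`.
// The stream lock is taken once for the whole batch, not once per record.
// Returns the number of complete records read.
std::size_t read_records_locked_once(std::FILE* f, char* out,
                                     std::size_t record_size,
                                     std::size_t max_records) {
    assert(f != nullptr && out != nullptr && record_size > 0);
    std::size_t got = 0;
    lock_stream(f);
    while (got < max_records &&
           read_nolock(out + got * record_size, record_size, 1, f) == 1) {
        ++got;
    }
    unlock_stream(f);
    return got;
}
```

Other threads that use the regular locking functions on the same stream simply wait until the batch is finished; threads that call the unlocked functions on that stream without taking the lock are not protected by this scheme.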

Christophe answered Oct 23 '22 01:10