Risk Assessment: Using Pthreads (vs. GCD or NSThread)

A colleague recently suggested that I use pthreads instead of GCD because it's "way faster." I don't dispute that it's faster, but what's the risk with pthreads?

My feeling is that they will ultimately be nowhere near as idiot-proof as GCD (and my team of one is 50% idiots). Are pthreads hard to get right?

Dan Rosenstark asked Jan 05 '13 23:01


3 Answers

GCD and pthreads are both ways of doing work asynchronously, but they are significantly different. Most descriptions of GCD explain it in terms of threads and thread pooling, but as DrPizza puts it:

to concentrate on [threads and thread pools] is to miss the point. GCD’s value lies not in thread pooling, but in queuing.
    (from "Grand Central Dispatch for Win32: why I want it")

GCD has some nice benefits over APIs like pthreads.

  • GCD does more to encourage and support "islands of serialization in a sea of parallelism." It makes it easy to avoid many of the locks, mutexes, and condition variables that are the usual way of communicating between threads, because you decompose your program into tasks and GCD gets each task's input and output to the appropriate thread behind the scenes. You can write mostly serial code and stop worrying about many of the hazards that usually come with threaded code, which makes the code simpler and less bug-prone.

  • GCD can do the scaling for you, so the program uses as much parallelism as the hardware and the dependencies between your tasks allow. Of course, designing the program to be scalable is generally the hard bit, but you still need something that takes advantage of that work and actually runs as much as possible in parallel. Work-stealing schedulers like GCD do that part.

  • GCD is composable. If you explicitly spawn threads for things you want to do asynchronously or in parallel, you can run into a problem when libraries you use do the same thing. Say you decide to run eight threads simultaneously because that's how many will be effective for your program on the machine it runs on. Then say a library you call on each thread makes the same decision. Now you could have up to 64 threads running at once, which is more than you know is effective for your program.

    Thread pooling solves this, but only if everyone uses the same thread pool. GCD uses thread pooling internally and provides the same pool to everyone.

  • GCD provides a bunch of 'sources' and makes it easy to write an event-driven program that depends on or takes input from them. For example, you can very easily set up a queue to launch a task every time data is available to read on a network socket, or when a timer fires, or whatever.
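To make the first bullet concrete, here is a minimal sketch of the lock-and-condition-variable machinery that raw pthreads needs just to hand one result from a worker thread back to its caller. The names `mailbox_t`, `worker`, and `run_handoff` are made up for illustration, and the "work" is a placeholder multiplication; treat it as a sketch under those assumptions, not a recommended design.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical mailbox for passing one int from a worker to its caller. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready_cond;
    bool            ready;   /* predicate, only touched while holding lock */
    int             value;
} mailbox_t;

/* Worker: compute something, publish it under the lock, then signal. */
static void *worker(void *arg) {
    mailbox_t *box = arg;
    int result = 6 * 7;                      /* stand-in for real work */

    pthread_mutex_lock(&box->lock);
    box->value = result;
    box->ready = true;
    pthread_cond_signal(&box->ready_cond);
    pthread_mutex_unlock(&box->lock);
    return NULL;
}

/* Start the worker, wait for its result, clean up, and return the result. */
int run_handoff(void) {
    mailbox_t box = { .ready = false, .value = 0 };
    pthread_t tid;
    int rc;

    rc = pthread_mutex_init(&box.lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&box.ready_cond, NULL);
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, worker, &box);
    assert(rc == 0);

    pthread_mutex_lock(&box.lock);
    while (!box.ready)                       /* loop guards against spurious wakeups */
        pthread_cond_wait(&box.ready_cond, &box.lock);
    int value = box.value;
    pthread_mutex_unlock(&box.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&box.ready_cond);
    pthread_mutex_destroy(&box.lock);
    return value;
}

#ifdef HANDOFF_DEMO
int main(void) {
    printf("worker handed back %d\n", run_handoff());
    return 0;
}
#endif
```

Drop the `while` loop around `pthread_cond_wait`, or set `ready` without holding the lock, and you get a bug that only shows up occasionally. With GCD the same handoff is typically a block submitted with `dispatch_async` that submits its result to another queue, with no mutex or condition variable in your own code. To try the sketch standalone, something like `cc -pthread -DHANDOFF_DEMO handoff.c` should work (the file name is arbitrary).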

bames53 answered Sep 20 '22 11:09


I don't think they're hard to get right, but having worked with many different approaches over the years (pthreads, GCD, NSThread, NSOperationQueue, etc.), I have no evidence to support an assertion like "pthreads are way faster." Even if they were faster (and I would expect the difference to be marginal at best), I always say, "use the highest level abstraction that gets the job done." Also, avoid premature optimization.

Anecdotally speaking, GCD is pretty damn fast. As I see it, portability is the primary advantage of pthreads over GCD. If this is OS X/iOS-exclusive code, I see no advantage whatsoever to using pthreads, absent empirical evidence to the contrary.

ipmcc answered Sep 21 '22 11:09


Ignore the other well-thought-out technical reasons, because they aren't relevant. You are not writing software for a benchmark, are you? At some point, a user is going to sit in front of your device and try to use it. And do you know what happens if you use pthreads instead of GCD? Your software doesn't scale well when other software is multitasking alongside it, because it fights for the CPU as if it were the only software running. Which is crazy. Nobody runs single-task OSes any more. Even "single-task" iOS runs plenty of stuff in the background.

Instead, if all the programs you are running use GCD, the OS can scale the number of concurrent tasks running on their queues to better match the number of actual processors, reducing task-switching overhead.
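As a rough illustration of the pthreads side of this, here is how a hand-rolled program typically sizes its worker pool: it asks how many cores are online and claims all of them, with no way of knowing what other processes are doing. The helper names (`pick_worker_count`, `run_fixed_pool`, `threads_competing`) are hypothetical, the workers do no real work, and `_SC_NPROCESSORS_ONLN` is assumed to be available, as it is on Linux and macOS.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Counts how many workers actually started in this process. */
static atomic_int started = 0;

static void *worker(void *arg) {
    (void)arg;
    atomic_fetch_add(&started, 1);           /* stand-in for real work */
    return NULL;
}

/* Common sizing rule: one worker per online core, as if we owned the machine. */
long pick_worker_count(void) {
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

/* Spawn the fixed-size pool, join it, and report how many threads ran. */
long run_fixed_pool(void) {
    long n = pick_worker_count();
    pthread_t *tids = malloc((size_t)n * sizeof *tids);
    assert(tids != NULL);

    atomic_store(&started, 0);
    for (long i = 0; i < n; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (long i = 0; i < n; i++)
        pthread_join(tids[i], NULL);

    free(tids);
    return atomic_load(&started);
}

/* Threads requested in total if `programs` processes all use the same rule. */
long threads_competing(long programs) {
    return programs * pick_worker_count();
}

#ifdef POOL_DEMO
int main(void) {
    printf("this process ran %ld workers\n", run_fixed_pool());
    printf("3 such programs would ask for %ld threads\n", threads_competing(3));
    return 0;
}
#endif
```

On a 4-core machine, three programs written this way ask for 12 runnable threads between them, and none of them can tell. Nothing in the code above has the system-wide view needed to back off, which is the part the OS can do when the work is sitting on GCD queues instead.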

If your program doesn't require pseudo-real-time low latency, and thus a dedicated thread to process data as soon as it is available (maybe that's what your colleague means by "way faster"), chances are GCD will be better for the user because it makes better use of the resources available on their device. Even if GCD's API were horrible or slow, it would be worth using over other solutions that don't scale across different processes.

Grzegorz Adam Hankiewicz answered Sep 23 '22 11:09