Why does Dropbox use so many threads?

My understanding of threads is that you can only have one thread per core, two with hyper-threading, before you start losing efficiency.

This computer has eight cores and so should work best with 8/16 threads then, yet many applications use several times that, especially Dropbox.

[Screenshot: Dropbox process on Windows 7, 104 threads highlighted]

It also uses 95 threads while idling on my laptop, which only has 4 cores.

Why is this the case? Does it have so many threads for programming convenience, have I misunderstood threading efficiency or is it something else entirely?

Nick Coughlin asked Apr 24 '17 08:04


2 Answers

I took a peek at the Mac version of the client. It seems to be written in Python, and it uses several frameworks.

  • A bunch of threads seem to be used by some in-house actor system
  • They use nucleus for app analytics
  • There seems to be a p2p network
  • Some networking threads (one per logical, hyper-threaded core)
  • A global pool (one per physical core)
  • Many threads for file monitoring and thumbnail generation
  • Task schedulers
  • Logging
  • Metrics
  • DB checkpointing
  • Something called infinite configuration
  • etc.

Most are idle.

It looks like a hodgepodge of subsystems, each starting its own threads, but they don't seem too expensive in terms of memory or CPU.
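To illustrate why mostly idle threads cost so little CPU, here is a minimal Python sketch. It is not taken from the Dropbox client; the function names, the count of 100 threads, and the half-second measurement window are all assumptions chosen for illustration. Each thread blocks on an event, much like a subsystem thread waiting for work, and the script measures how much CPU time the whole process consumes while they wait.

```python
import threading
import time


def start_idle_threads(count, stop_event):
    """Start `count` threads that do nothing but block until `stop_event` is set."""
    threads = [
        threading.Thread(target=stop_event.wait, daemon=True)
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    return threads


def cpu_seconds_while_idle(thread_count=100, wall_seconds=0.5):
    """Return (threads alive, CPU seconds used by the process) over an idle window."""
    stop = threading.Event()
    threads = start_idle_threads(thread_count, stop)

    cpu_before = time.process_time()   # CPU time of the whole process, all threads
    time.sleep(wall_seconds)           # the main thread idles as well
    cpu_used = time.process_time() - cpu_before

    alive = sum(t.is_alive() for t in threads)

    stop.set()                         # wake every thread so it can exit
    for t in threads:
        t.join()
    return alive, cpu_used


if __name__ == "__main__":
    alive, cpu_used = cpu_seconds_while_idle()
    print(f"{alive} blocked threads used {cpu_used:.4f} s of CPU in 0.5 s of wall time")
```

Blocked threads are not scheduled at all, so the CPU time reported should be a small fraction of the wall time. Each thread still reserves some stack memory, which this sketch does not measure.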

juancn answered Oct 16 '22 17:10


My understanding of threads is that you can only have one thread per core, two with hyper-threading, before you start losing efficiency.

Nope, this is not true. I'm not sure why you think that, but it's not true.

As just the most obvious way to show that it's false, suppose you had that number of threads and one of them accessed a page of memory that wasn't in RAM and had to be loaded from disk. If you don't have any other threads that can run, then one core is wasted for the entire time it takes to read that page of memory from disk.
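A rough way to see this effect is the Python sketch below. It rests on stated assumptions: `time.sleep` stands in for a thread blocked on a disk read, the 0.05 second delay is arbitrary, and a pool of 2 workers plays the role of "one thread per core" on a hypothetical two-core machine. The function names are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

BLOCK_SECONDS = 0.05  # assumed stand-in for one blocking page read from disk


def blocking_task(_):
    """Simulate a thread that blocks (for example on a page fault) and uses no CPU."""
    time.sleep(BLOCK_SECONDS)


def run_tasks(worker_threads, task_count=16):
    """Run `task_count` blocking tasks on `worker_threads` threads; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=worker_threads) as pool:
        list(pool.map(blocking_task, range(task_count)))
    return time.perf_counter() - start


if __name__ == "__main__":
    few = run_tasks(worker_threads=2)    # "one thread per core" on a hypothetical 2-core box
    many = run_tasks(worker_threads=16)  # far more threads than cores
    print(f"2 threads: {few:.2f} s, 16 threads: {many:.2f} s")
```

With only two threads, each blocked thread leaves its slot unused until the wait ends, so the tasks run in eight rounds. With sixteen threads, all the waits overlap and the batch finishes in roughly the time of one wait.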

It's hard to address the misconception directly without knowing what flawed chain of reasoning led to it. But the most common one is that if you have more threads ready-to-run than you can execute at once, then you have lots of context switches and context switches are expensive.

But that is obviously wrong. If all the threads are ready-to-run, then no context switches are necessary. A context switch is only necessary if a running thread stops being ready-to-run.

If all context switches are voluntary, then the implementation can select the optimum number of context switches. And that's precisely what it does.

Having large numbers of threads causes you to lose efficiency if, and only if, lots of threads do a small amount of work and then become no longer ready-to-run while other waiting threads are ready-to-run. That forces the implementation to do a context switch even where it is not optimal.

Some applications that use lots of threads do in fact do this. And that does result in poor performance. But Dropbox doesn't.

David Schwartz answered Oct 16 '22 19:10