
When will parallel increase performance

I'm trying to understand when using parallelism will actually increase performance.
I tested it with a simple piece of code that ran over 100,000 items in a List<Person> and changed the name of each one to string.Empty.

The parallel version took twice as long as the regular version. (Yes, I tested on a machine with more than one core...)
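
Roughly, this is the kind of test I mean (a simplified sketch; the Person type, the data setup, and the timing code are just for illustration, not my exact code):

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading.Tasks;

    class Person
    {
        public string Name { get; set; }
    }

    class Program
    {
        static void Main()
        {
            var people = Enumerable.Range(0, 100_000)
                                   .Select(i => new Person { Name = "Name" + i })
                                   .ToList();

            var sw = Stopwatch.StartNew();
            foreach (var p in people)
                p.Name = string.Empty;          // trivial work per item
            sw.Stop();
            Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms");

            sw.Restart();
            Parallel.ForEach(people, p =>
            {
                p.Name = string.Empty;          // same trivial work, plus partitioning and scheduling overhead
            });
            sw.Stop();
            Console.WriteLine($"Parallel:   {sw.ElapsedMilliseconds} ms");
        }
    }

With work this cheap per item, the cost of partitioning the list and coordinating the worker threads can easily exceed the cost of the loop body itself, which would explain the parallel version coming out slower.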

I saw this answer saying that parallelism over a slice of data is not always good for performance.
The same caution is also repeated on every page of the parallel examples in the MSDN tutorial:

These examples are primarily intended to demonstrate usage, and may or may not run faster than the equivalent sequential LINQ to Objects queries

I need some rules and tips for when parallelism will increase the performance of my code and when it will not.
The obvious answer, "Test your code; if the parallel loop is faster, use it", is absolutely right, but I guess no one runs a performance analysis on every loop they write.

asked Nov 27 '22 by gdoron is supporting Monica


1 Answer

Think about when it is worthwhile to parallelize something in real life. When is it better to just sit down and do a job yourself from start to finish, and when is it better to hire twenty guys?

  • Is the work inherently parallelizable or inherently serial? Some jobs are not parallelizable at all: nine women can't work together to make one baby in a month. Some jobs are parallelizable but give lousy results: you could hire twenty guys and assign each of them fifty pages of War and Peace to read for you, and then have each of them write one twentieth of an essay, glue all the essay fragments together and submit the paper; that's unlikely to result in a good grade. Some jobs are very parallelizable: twenty guys with shovels can dig a hole much faster than one guy.

  • If the work is inherently parallelizable, does parallelization actually save time? You can cook a pot of spaghetti with a hundred noodles in it, or you can cook twenty pots of spaghetti with five noodles in each and pour the results together at the end. I guarantee you that parallelizing the task of cooking spaghetti does not result in getting your dinner any faster.

  • If the work is inherently parallelizable, and there is a possible time savings, does the cost of hiring those guys pay for the savings in time? If it's faster to just do the job yourself than it is to hire the guys, parallelization is not a win. Hiring twenty guys to do a job that takes you five seconds, and hoping that they'll get it done in a quarter second is not a savings if it takes you a day to find the guys.

Parallelization tends to be a win when the work is enormous and parallelizable. Setting a hundred thousand pointers to null is something a computer can do in a tiny fraction of a second; there's no enormous cost, so there's no savings. Try doing something non-trivial; say, write a compiler and do semantic analysis of method bodies in parallel. You'll be more likely to get a win there.
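
To make that concrete, here is a rough sketch of "enormous and parallelizable" work (the naive primality test is an arbitrary stand-in for non-trivial per-item work, not anything specific to compilers):

    using System;
    using System.Diagnostics;
    using System.Linq;

    class Program
    {
        // Deliberately expensive per-item work: a naive trial-division primality test.
        static bool IsPrime(int n)
        {
            if (n < 2) return false;
            for (int i = 2; i * i <= n; i++)
                if (n % i == 0) return false;
            return true;
        }

        static void Main()
        {
            int[] numbers = Enumerable.Range(1_000_000, 200_000).ToArray();

            var sw = Stopwatch.StartNew();
            int sequentialCount = numbers.Count(IsPrime);
            sw.Stop();
            Console.WriteLine($"Sequential: {sequentialCount} primes in {sw.ElapsedMilliseconds} ms");

            sw.Restart();
            int parallelCount = numbers.AsParallel().Count(IsPrime);
            sw.Stop();
            Console.WriteLine($"Parallel:   {parallelCount} primes in {sw.ElapsedMilliseconds} ms");
        }
    }

Each item here costs thousands of operations rather than a single pointer write, so the per-item work dwarfs the coordination overhead and the parallel version has a real chance to win.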

answered Dec 10 '22 by Eric Lippert