
F# task parallelism under Mono doesn't "appear" to execute in parallel

I have the following dummy code to test out TPL in F# (Mono 4.5, Xamarin Studio, quad-core MacBook Pro).

To my surprise, all the tasks run on the same thread. There is no parallelism at all.

open System
open System.Threading
open System.Threading.Tasks


let doWork (num:int) (taskId:int) : unit =
    for i in 1 .. num do
        Thread.Sleep(10)
        for j in 1 .. 1000 do
            ()
        Console.WriteLine(String.Format("Task {0} loop: {1}, thread id {2}", taskId, i, Thread.CurrentThread.ManagedThreadId)) 

[<EntryPoint>]
let main argv = 

    let t2 = Task.Factory.StartNew(fun() -> doWork 10 2)
    //printfn "launched t2"
    Console.WriteLine("launched t2")
    let t1 = Task.Factory.StartNew(fun() -> doWork 8 1)
    Console.WriteLine("launched t1")
    let t3 = Task.Factory.StartNew(fun() -> doWork 10 3)
    Console.WriteLine("launched t3")
    let t4 = Task.Factory.StartNew(fun() -> doWork 5 4)
    Console.WriteLine("launched t4")
    Task.WaitAll(t1,t2,t3,t4)
    0 // return an integer exit code

However, if I increase the thread sleep time from 10 ms to 100 ms, I can see a little parallelism.

What have I done wrong? What does this mean? I did consider the possibility that the CPU finishes the work before TPL can start the task on a new thread, but this doesn't make sense to me. I can make the inner dummy loop (for j in 1 .. 1000 do ()) run 1000 times longer, and the result is the same: no parallelism (with Thread.Sleep set to 10 ms).

The same code in C#, on the other hand, produces the desired result: all tasks print their messages to the window in a mixed order rather than sequentially.

Update:

As suggested, I changed the inner loop to do some 'actual' work, but everything still executes on a single thread.

Update 2:

I don't quite understand Luaan's comments, but I just ran a test on a friend's PC. With the same code, parallelism works there (even without Thread.Sleep). It looks like this has something to do with Mono. But can Luaan explain again what I should expect from TPL? If I have tasks that I want to run in parallel to take advantage of a multicore CPU, isn't TPL the way to go?

Update 3:

I have tried @FyodorSoikin's suggestion again with dummy code that won't be optimized away. Unfortunately, the workload is still not enough to make Mono's TPL use multiple threads. Currently, the only way I can get Mono's TPL to allocate multiple threads is to force the existing thread to sleep for more than 20 ms. I am not qualified to assert that Mono is wrong, but I can confirm that the same code (same benchmark workload) behaves differently under Mono and Windows.

casbby asked Dec 19 '22 03:12


2 Answers

It looks like the Sleep calls are ignored completely. See how the Task 2 loop is printed even before the next task is launched; if the thread had really waited for 10 ms, there's no way that could happen.

I'd assume that the cause might be the timer resolution in the OS. Sleep is far from accurate, and it might very well be that Mono (or Mac OS) decides that since it can't reliably make you run again in 10 ms, the best choice is to simply let you run right now. This is not how it works on Windows: there you're guaranteed to lose control as long as you don't call Sleep(0), and you'll always sleep at least as long as you wanted. It seems that on Mono / Mac OS the idea is the reverse: the OS tries to let you sleep at most the amount of time you specified. If you want to sleep for less time than the timer precision, too bad, no sleep.
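One way to check the timer-resolution guess is to time the sleeps directly. The following is a minimal sketch, not part of the original program: the helper name measureSleep and the list of durations are made up for illustration, and it only reports what Thread.Sleep does on the machine where it runs.

```fsharp
open System.Diagnostics
open System.Threading

// Hypothetical helper: returns how many milliseconds Thread.Sleep(requestedMs)
// actually blocked the calling thread, measured with a Stopwatch.
let measureSleep (requestedMs: int) : float =
    let sw = Stopwatch.StartNew()
    Thread.Sleep(requestedMs)
    sw.Stop()
    sw.Elapsed.TotalMilliseconds

// Compare requested and measured durations for a few illustrative values.
for requested in [ 1; 10; 20; 100 ] do
    let actual = measureSleep requested
    printfn "requested %d ms, slept %.2f ms" requested actual
```

If the measured values for 10 ms come out close to zero under Mono, that would support the idea that short sleeps are effectively skipped; if they come out at 10 ms or more, the explanation has to lie elsewhere.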

But even if they are not ignored, there's still not a lot of pressure on the thread pool to give you more threads. You're only blocking for less than 100 ms, for four tasks in a line, and that's not nearly enough for the pool to start creating new threads to handle the requests (on MS.NET, new threads are only spun up after there have been no free threads for 200 ms, IIRC). You're simply not doing enough work for it to be worth spinning up new threads!

The point you might be missing is that Task.Factory.StartNew is not actually starting any new threads, ever. Instead, it schedules the associated task on the default task scheduler, which just puts it in the thread pool queue as work to execute "at earliest convenience", basically. If there's one free thread in the pool, the first task starts running there almost immediately. The second will run when another thread is free, and so on. Only if the thread usage is "bad" (i.e. the threads are "blocked": they're not doing any CPU work, but they're not free either) is the thread pool going to spawn new threads.
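If the pool's reluctance to add threads is the issue, one thing to try is raising its minimum worker count, since threads up to the minimum are created on demand rather than after a delay. This is a hedged sketch using ThreadPool.GetMinThreads and ThreadPool.SetMinThreads; the target of 4 is an arbitrary choice matching the four tasks in the question, and whether it changes the behaviour under Mono is something to verify, not a given.

```fsharp
open System.Threading

// Read the current minimums as a tuple: (worker threads, I/O completion threads).
let workerMin, ioMin = ThreadPool.GetMinThreads()
printfn "min worker threads: %d, min I/O threads: %d" workerMin ioMin

// Ask for at least 4 worker threads (illustrative value only).
// SetMinThreads returns false if the runtime rejects the request.
let requested = max workerMin 4
let accepted = ThreadPool.SetMinThreads(requested, ioMin)
printfn "SetMinThreads(%d, %d) accepted: %b" requested ioMin accepted
```

The call would go at the top of main, before the first Task.Factory.StartNew.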
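To make the scheduling point concrete: if a task really should get its own thread instead of waiting in the pool queue, TaskCreationOptions.LongRunning can be passed to StartNew. It is only a hint to the scheduler, so the sketch below makes no promise about which threads are used. The names work and startLongRunning are invented for this example.

```fsharp
open System
open System.Threading
open System.Threading.Tasks

// Hypothetical stand-in for doWork: blocks briefly, then reports the
// managed thread id it ran on.
let work (taskId: int) : int =
    Thread.Sleep(10)
    Thread.CurrentThread.ManagedThreadId

// Start a task with the LongRunning hint. The explicit Func<int> avoids
// any ambiguity between the Action and Func overloads of StartNew.
let startLongRunning (taskId: int) : Task<int> =
    Task.Factory.StartNew(Func<int>(fun () -> work taskId), TaskCreationOptions.LongRunning)

let tasks = [| for id in 1 .. 4 -> startLongRunning id |]
Task.WaitAll(tasks |> Array.map (fun t -> t :> Task))

// Collect the thread id each task observed.
let threadIds = tasks |> Array.map (fun t -> t.Result)
printfn "thread ids: %A" threadIds
```

Seeing several distinct ids here, but a single id with plain StartNew, would be consistent with the thread pool explanation above.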

Luaan answered May 11 '23 01:05


If you look at the IL output from this program, you'll see that the inner loop is optimized away, because it doesn't have any side effects, and its return value is completely ignored.

To make it count, put something non-optimizable there, and also make it heavier: 1000 empty cycles is hardly noticeable compared to the cost of spinning up a new task.

For example:

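open System.Diagnostics // needed for Debug.WriteLine

let doWork (num:int) (taskId:int) : unit =
    for i in 1 .. num do
        Thread.Sleep(10)
        // A call with a side effect, so the loop body cannot be treated as dead code.
        for j in 1 .. 1000 do
            Debug.WriteLine("x")
        Console.WriteLine(String.Format("Task {0} loop: {1}, thread id {2}", taskId, i, Thread.CurrentThread.ManagedThreadId))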

Update:
Adding a pure function, such as your fact, is no good. The compiler is perfectly able to see that fact has no side effects and that you duly ignore its return value, and therefore, it is perfectly cool to optimize it away. You need to do something that the compiler doesn't know how to optimize, such as Debug.WriteLine above.
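Another way to keep a computation from being discarded is to actually use its result. This is a sketch of that alternative, not the approach shown above: busyWork and its iteration count are made up for illustration, and printing the checksum is what gives the loop an observable effect.

```fsharp
open System.Threading

// Hypothetical CPU-bound workload: sums (j % 7) for j = 1 .. iterations.
let busyWork (iterations: int) : int64 =
    let mutable acc = 0L
    for j in 1 .. iterations do
        acc <- acc + int64 (j % 7)
    acc

// Variant of doWork that prints the checksum, so the result is consumed
// and the computation cannot simply be dropped as unused.
let doWork (num: int) (taskId: int) : unit =
    for i in 1 .. num do
        let checksum = busyWork 1000000
        printfn "Task %d loop: %d, checksum %d, thread id %d"
            taskId i checksum Thread.CurrentThread.ManagedThreadId
```

The iteration count can be raised until each loop takes long enough to matter relative to the cost of scheduling a task.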

Fyodor Soikin answered May 11 '23 01:05