Scala appears to have a ton of features and improvements over Java. I am having a hard time isolating the stuff I want to learn first about Scala. What should I look for on Google if I just want to, for example, take for loops and make them run over multiple threads or processes? I am coming from a GPU computing background where it was really simple to get a high-level view of how to make things run faster.
Threads in Scala can be created using two mechanisms: extending the Thread class, or implementing the Runnable interface and passing the instance to a Thread.
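As a minimal sketch of the two mechanisms above: the class and method names here (SquareThread, SquareTask, viaThread, viaRunnable) are made up for illustration, and squaring a number simply stands in for real work. Only java.lang.Thread and java.lang.Runnable from the JDK are used.

```scala
object ThreadDemo {
  // Mechanism 1: subclass Thread and override run().
  class SquareThread(n: Int) extends Thread {
    @volatile var result: Int = 0
    override def run(): Unit = { result = n * n }
  }

  // Mechanism 2: implement Runnable and hand the instance to a Thread.
  class SquareTask(n: Int) extends Runnable {
    @volatile var result: Int = 0
    override def run(): Unit = { result = n * n }
  }

  def viaThread(n: Int): Int = {
    val t = new SquareThread(n)
    t.start() // start() runs run() on a new thread; calling run() directly would not
    t.join()  // wait for the thread to finish before reading the result
    t.result
  }

  def viaRunnable(n: Int): Int = {
    val task = new SquareTask(n)
    val t = new Thread(task)
    t.start()
    t.join()
    task.result
  }
}
```

Managing threads by hand like this is the lowest-level option; it is mostly useful for understanding what the higher-level tools are doing underneath.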
Scala supports multithreading, which means we can execute multiple threads at once. We can perform multiple operations independently and achieve multitasking. This lets us develop concurrent applications.
In a shared-memory multiprocessor environment, each thread of a multithreaded process can run concurrently on a separate processor, resulting in parallel execution, which is true simultaneous execution.
Scala's parallel collections are particularly easy. Parallelizing an expensive operation f(i) over integers i <- 1 to 10 is as simple as:

    (1 to 10).par.map(i => f(i))
Scala will try to allocate a number of worker threads that is comparable to the number of cores/processors available in your system. Here's a video with more details: http://days2010.scala-lang.org/node/138/140
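One caveat: since Scala 2.13, `.par` lives in the separate scala-parallel-collections module and needs `import scala.collection.parallel.CollectionConverters._`. If you would rather stay within the standard library, a rough equivalent can be sketched with Futures. This is a different technique from parallel collections, not a drop-in replacement. The function f below is a hypothetical stand-in for your expensive operation, and the 30 second timeout is an arbitrary assumption.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelMapDemo {
  // Hypothetical stand-in for an expensive function.
  def f(i: Int): Int = i * i

  // Wraps each call to f in its own Future on the global thread pool,
  // then gathers the results in the original order.
  def parallelMap(xs: Seq[Int]): Seq[Int] = {
    val all: Future[Seq[Int]] = Future.traverse(xs)(i => Future(f(i)))
    Await.result(all, 30.seconds) // block until every element is done
  }
}
```

Calling `ParallelMapDemo.parallelMap(1 to 10)` gives the same values as `(1 to 10).map(f)`. The global execution context sizes its pool from the number of available processors by default, so the effect is similar in spirit to what `.par` does.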
The Akka framework is a mature, primarily actor-based approach to concurrency that allows parallelizing over threads or remote processes. Actors are basically lightweight, thread-like entities that pass messages to each other rather than share state. The newly formed company Typesafe is developing both the Scala language and Akka.
You can also try a draft release of a software transactional memory (STM) library for Scala. The library is aiming for inclusion in the standard Scala distribution. Compared to manually managed threads, STM is a simpler model of concurrency that reduces the possibility of errors such as deadlocks. It works by grouping a sequence of reads and writes to shared state into a single atomic block (a transaction) that can fail, roll back, and retry if multiple threads do something with shared state that turns out to be mutually incompatible. Presumably there is some performance cost to pay for the convenience; I'm not sure how well STM scales to large numbers of threads.
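To give a feel for the "attempt, detect conflict, retry" idea without pulling in an STM library, here is a rough sketch using a compare-and-set loop on a single AtomicReference from the JDK. To be clear, this is not STM and not the API of any STM library: a real STM coordinates many variables inside one transaction, while this only handles one reference. The Balances type and the transfer and run methods are hypothetical names for illustration.

```scala
import java.util.concurrent.atomic.AtomicReference

object OptimisticUpdateDemo {
  // Immutable snapshot of the shared state (hypothetical example data).
  final case class Balances(a: Int, b: Int)

  // Read a snapshot, compute the new value, and commit only if no other
  // thread changed the state in the meantime; otherwise discard and retry.
  @annotation.tailrec
  def transfer(state: AtomicReference[Balances], amount: Int): Unit = {
    val before = state.get()
    val after = Balances(before.a - amount, before.b + amount)
    if (!state.compareAndSet(before, after)) transfer(state, amount)
  }

  // Runs `threads` workers, each moving 1 unit from a to b `perThread` times.
  def run(threads: Int, perThread: Int): Balances = {
    val state = new AtomicReference(Balances(100, 0))
    val workers = (1 to threads).map { _ =>
      new Thread(() => (1 to perThread).foreach(_ => transfer(state, 1)))
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    state.get()
  }
}
```

No locks are taken, so there is nothing to deadlock on; the price is that a conflicting update throws away its work and tries again, which is the kind of cost I would expect to grow under heavy contention.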
The Spark framework addresses cluster computing, and seems to be a generalization of MapReduce.
For GPU programming from Scala, there's ScalaCL. You can also use the Java bindings.
There's also the very interesting language virtualization work being done as a joint effort between the labs at Stanford and EPFL. This page has links to papers, and there is a course at Stanford with many more links. There are several exciting applications for developing DSLs for high performance computing over heterogeneous computing environments, including GPUs.
Update. Daniel Sobral also suggested the Tools and Libraries wiki.