I have a list with assorted keywords that may repeat. I need to generate a list with distinct keywords but sorted by the frequency of which they appeared on the original list.
How would be the idiomatic Scala for that? Here is a working but ugly implementation:
val keys = List("c","a","b","b","a","a")
keys.groupBy(p => p).toList.sortWith( (a,b) => a._2.size > b._2.size ).map(_._1)
// List("a","b","c")
Shorter version:
keys.distinct.sortBy(keys count _.==).reverse
That is not particular efficient, however. The groupBy version ought to perform better, though it can be improved:
keys.groupBy(identity).toSeq.sortBy(_._2.size).map(_._1)
One can also get rid of the reverse in the first version by declaring an Ordering:
val ord = Ordering by (keys count (_: String).==)
keys.distinct.sorted(ord.reverse)
Note that reverse in this version just produces a new Ordering that works in the opposite manner of the original. This version also suggests a way to get better performance:
val freq = collection.mutable.Map.empty[String, Int] withDefaultValue 0
keys foreach (k => freq(k) += 1)
val ord = Ordering by freq
keys.distinct.sorted(ord.reverse)
Nothing wrong with that implementation that comments can't fix! Seriously, break it down a bit and describe what & why you're taking each step.
Not as "concise" perhaps, but the purpose of concise code in scala is to make code more readable. When concise code is not clear it's time to back up, break up (introduce well named local variables), and comment.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With