I'd like to group a sequence into a map of sequences, based on a discriminator of type Option
, similar to the result of the groupBy
method but where values resulting in None
are discarded. Or perhaps, grouping by a PartialFunction
discriminator and discarding those for which the partial function is not defined.
Here is a concrete example:
I have a collection of names, and a collection of namespaces. Some, but not all, of the names belong in a valid namespace, and I want to group those that do into a Map, discarding those that don't.
Currently my solution is equivalent to:
val names = List("ns1.foo", "ns2.bar", "ns2.baz", "froznit")
val namespaces = List("ns1", "ns2")
def findNamespace(n: String): Option[String] = namespaces.find(n.startsWith)
val groupedNames = names.groupBy(findNamespace).collect {
case (Some(ns), name) => (ns, name)
}
// Map((ns1,List(ns1.foo)), (ns2,List(ns2.bar, ns2.baz)))
My concern about this solution is that, using names.groupBy(findNamespace)
, I'm creating an intermediate Map which contains all the names I don't care about, under the key None
. If the number of names which I discard becomes large, this solution becomes less attractive.
My attempt to avoid this is a bit of a trainwreck, though:
val groupedNames =
names.
map(n => (findNamespace(n), n)).
collect({ case (Some(ns), n) => (ns, n) }).
groupBy(_._1).
map({ case (ns, names) => (ns, names.map(_._2)) })
If you were to solve this in a smarter way, what would it be?
Edit: ideally, the solution should only call findNamespace(name)
once for each name, and build the Map using only the Option[String]
values, without calling a separate hasNamespace(name)
predicate.
One way to avoid collecting the discarded names is to use flatMap
:
names.flatMap(n => findNamespace(n) map (ns => (ns, n)))
.groupBy(_._1)
.map { case (ns, pairs) => (ns, pairs map (_._2)) }
You can achieve the same thing with a for-comprehension:
(for (n <- names; ns <- findNamespace(n)) yield (ns, n))
.groupBy(_._1)
.map { case (ns, pairs) => (ns, pairs map (_._2)) }
You can use a foldLeft:
val gn = names.foldLeft(Map[String, List[String]]()){ case (acc, name) =>
findNamespace(name) match {
case Some(ns) => acc + (ns -> (name :: acc.get(ns).getOrElse(Nil)))
case _ => acc
}
}
Assuming order does not matter otherwise you can reverse the values with gn.mapValues(_.reverse)
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With