I'm looking to flatten an RDD of tuples (using a no-op map), but I'm getting a type error:
val fromTuples = sc.parallelize( List((1,"a"), (2, "b"), (3, "c")) )
val flattened = fromTuples.flatMap(x => x)
println(flattened.collect().mkString(", "))
This gives:
error: type mismatch;
 found   : (Int, String)
 required: TraversableOnce[?]
         val flattened = fromTuples.flatMap(x => x)
The equivalent list of Lists or Arrays work fine, e.g.:
val fromList = sc.parallelize(List(List(1, 2), List(3, 4)))
val flattened = fromList.flatMap(x => x)
println(flattened.collect().mkString(", "))
Can Scala handle this? If not, why not?
Tuples aren't collections. Unlike Python, where a tuple is essentially just an immutable list, a tuple in Scala is more like a class (or more like a Python namedtuple). You can't "flatten" a tuple, because it's a heterogeneous group of fields.
You can convert a tuple to something iterable by calling .productIterator on it, but what you get back is an Iterator[Any]. You can certainly flatten such a thing, but you've lost all compile-time type protection that way. (Most Scala programmers shudder at the thought of a collection of type Any.)
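As a rough illustration of the point above, here is a minimal sketch using plain Scala Lists rather than an RDD, so it runs without a Spark context. The object name ProductIteratorDemo is made up for this example; the same lambda should also be accepted by RDD.flatMap, since an Iterator is a TraversableOnce.

```scala
object ProductIteratorDemo {
  val pair: (Int, String) = (1, "a")

  // productIterator walks the tuple's fields, but only as Any.
  val asList: List[Any] = pair.productIterator.toList

  // Flattening tuples this way compiles, but the element type is Any.
  val flattened: List[Any] =
    List((1, "a"), (2, "b"), (3, "c")).flatMap(_.productIterator)

  def main(args: Array[String]): Unit = {
    println(asList)     // List(1, a)
    println(flattened)  // List(1, a, 2, b, 3, c)
  }
}
```

Note that nothing stops you from later treating every element as a String; the mistake would only show up at runtime as a ClassCastException.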
There isn't a great way, but you can preserve a little type safety with this method:
val fromTuples = session.sparkContext.parallelize(List((1, "a"), (2, "b"), (3, "c")))
val flattened = fromTuples.flatMap(t => Seq(t._1, t._2))
println(flattened.collect().mkString)
The type of flattened will be an RDD of the closest common supertype of all the types in the tuple. Yes, in this case that is Any, but if the list were List(("1", "a"), ("2", "b")), it would preserve the String type.
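The inference described above can be checked without Spark. This is a small sketch on plain Lists (the object name TupleToSeqDemo is hypothetical); the explicit type annotations are there so the compiler confirms which element type each version ends up with.

```scala
object TupleToSeqDemo {
  // Mixed field types: the common supertype of Int and String is Any.
  val mixed: List[Any] =
    List((1, "a"), (2, "b")).flatMap(t => Seq(t._1, t._2))

  // Matching field types: the element type stays String,
  // so String methods remain available without a cast.
  val strings: List[String] =
    List(("1", "a"), ("2", "b")).flatMap(t => Seq(t._1, t._2))

  val upper: List[String] = strings.map(_.toUpperCase)

  def main(args: Array[String]): Unit = {
    println(mixed)
    println(upper)
  }
}
```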
val fromTuples = sc.parallelize(List((1, "a"), (2, "b"), (3, "c")))
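// Note: Array(x) wraps each tuple in a one-element array, so this compiles,
// but the result is still an RDD[(Int, String)] rather than a flat RDD of fields.
val flattened = fromTuples.flatMap(x => Array(x))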
flattened.collect()
The reason for your error is given in the Spark documentation for flatMap(func):
Similar to map, but each input item can be mapped to 0 or more output items (so func should return a Seq rather than a single item).
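To make the "0 or more output items" part concrete, here is a minimal sketch on a plain Scala List (the object name FlatMapArityDemo is invented for the example). It assumes that List.flatMap is a fair stand-in for RDD.flatMap in this respect: both require the function to return a collection, not a single value.

```scala
object FlatMapArityDemo {
  val numbers: List[Int] = List(0, 1, 2, 3)

  // Each n maps to n copies of itself: 0 contributes nothing,
  // 3 contributes three items. The per-item Seqs are concatenated.
  val repeated: List[Int] = numbers.flatMap(n => Seq.fill(n)(n))

  // Returning a bare tuple is rejected by the compiler, because a
  // tuple is not a collection:
  // List((1, "a")).flatMap(x => x)   // type mismatch

  def main(args: Array[String]): Unit = {
    println(repeated)  // List(1, 2, 2, 3, 3, 3)
  }
}
```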