This is probably a simple problem, but I am just beginning my adventure with Spark.
Problem: I'd like to get the structure shown below (expected result) in Spark. Right now I have the following structure:
title1, {word11, word12, word13 ...}
title2, {word12, word22, word23 ...}
The data is stored in a Dataset[(String, Seq[String])].
Expected result: I would like to get tuples of (word, title):
word11, {title1}
word12, {title1}
What I do:
1. Make (title, Seq[word1, word2, word3])
docs.mapPartitions { iter =>
  iter.map {
    case (title, contents) => {
      val textToLemmas: Seq[String] = toText(....)
      (title, textToLemmas)
    }
  }
}
Thanks for any answers.
This should work:
val result = dataSet.flatMap { case (title, words) => words.map((_, title)) }
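To see what that flatMap does without starting a Spark session, here is a minimal sketch on plain Scala collections. The object name InvertTitles, the helper toWordTitlePairs, and the sample data are made up for illustration; the assumption is that Dataset.flatMap treats each (title, words) record the same way a local Seq does.

```scala
// Plain Scala collections stand in for the Dataset here; no Spark required.
// InvertTitles and toWordTitlePairs are hypothetical names for this sketch.
object InvertTitles {
  // Emit one (word, title) pair for every word listed under a title.
  def toWordTitlePairs(docs: Seq[(String, Seq[String])]): Seq[(String, String)] =
    docs.flatMap { case (title, words) => words.map(word => (word, title)) }

  def main(args: Array[String]): Unit = {
    // Sample records shaped like the question's input.
    val docs = Seq(
      ("title1", Seq("word11", "word12")),
      ("title2", Seq("word12", "word22"))
    )
    toWordTitlePairs(docs).foreach(println)
  }
}
```

A title with an empty word list contributes no pairs, and a word that appears under two titles produces two separate pairs.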
Another solution is to call the explode function like this (note that the result keeps the (title, word) column order):
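import org.apache.spark.sql.functions.{col, explode}
// explode takes a Column, so wrap the column name with col(...).
// .as[(String, String)] needs the encoders from spark.implicits._ in scope.
dataset.withColumn("_2", explode(col("_2"))).as[(String, String)]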
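The expected result in the question shows each word next to a set of titles. If a word can occur under more than one title, the (word, title) pairs may need one more grouping step. The sketch below shows that step on plain Scala collections; GroupTitles and titlesByWord are hypothetical names, and on a real Dataset something like groupByKey followed by mapGroups would probably play the same role.

```scala
// Local-collection sketch of grouping (word, title) pairs by word.
// GroupTitles and titlesByWord are hypothetical names for this sketch.
object GroupTitles {
  // Collect every title that a word appears under into one set per word.
  def titlesByWord(pairs: Seq[(String, String)]): Map[String, Set[String]] =
    pairs
      .groupBy { case (word, _) => word }
      .map { case (word, group) => word -> group.map(_._2).toSet }

  def main(args: Array[String]): Unit = {
    // Pairs shaped like the output of the flatMap answer.
    val pairs = Seq(
      ("word11", "title1"),
      ("word12", "title1"),
      ("word12", "title2")
    )
    titlesByWord(pairs).foreach(println)
  }
}
```

Using a Set means a title is listed only once per word even if the word is repeated within the same document.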
Hope this helps. Best regards.