Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Scala - Update RDD with another Map

I'm trying to update an RDD with more information from another Map....I wrote this but is not working.

Where:

LocalCurrencies is a Sequence of Currency class

rdd: RDD[String, String]

...
val localCurrencies = Await.result(CurrencyDAO.currencies, 30 seconds)

//update ISO3
rdd.map(r => r.updated("currencyiso3", localCurrencies.find(c => c.CurrencyId ==   
rdd.get("currencyid")).get.ISO3))

//Update exponent
rdd.map(r => r.updated("exponent", localCurrencies.find(c => c.CurrencyId == 
rdd.get("currencyid")).get.Exponent))

Any suggestion ?

Thanks

like image 623
Adetiloye Philip Kehinde Avatar asked Mar 02 '26 06:03

Adetiloye Philip Kehinde


1 Answers

map doesn't modify an RDD, it creates a new one (the same applies to every Spark transformation). If you don't actually do anything with this new RDD, Spark won't even bother creating it. So you want to write

val rdd1 = rdd.map(...).map(...) // better to combine two `map`s into one

and work with rdd1 from then one (you can still use rdd as well, if needed). This isn't necessarily the only error, but you'll still need to fix it.

like image 56
Alexey Romanov Avatar answered Mar 04 '26 18:03

Alexey Romanov