 

Spark - extracting single value from DataFrame

I have a Spark DataFrame query that is guaranteed to return a single column with a single Int value. What is the best way to extract this value as an Int from the resulting DataFrame?

asked Aug 12 '15 by TheMP


People also ask

How do I extract values from a column in PySpark?

In order to convert a Spark DataFrame column to a List, first select() the column you want, next use the map() transformation to convert each Row to a String, and finally collect() the data to the driver, which returns an Array[String].
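
For illustration, a minimal Scala sketch of that select/map/collect pattern; the SparkSession setup and the DataFrame with its "name" column are hypothetical, and Spark 2.x with its implicit encoders is assumed:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("column-to-list").master("local[*]").getOrCreate()
import spark.implicits._

// hypothetical DataFrame with a "name" column
val df = Seq(("Alice", 1), ("Bob", 2)).toDF("name", "id")

// select() the column, map() each Row to a String, collect() to the driver
val names: Array[String] = df.select("name").map(_.getString(0)).collect()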

How do I get data from PySpark DataFrame?

PySpark collect() – retrieve data from a DataFrame. collect() is an action on an RDD or DataFrame that retrieves the data to the driver. It is useful for retrieving all the elements of every partition of an RDD or DataFrame and bringing them over to the driver node/program.
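
As a rough sketch (Scala API, same collect() semantics as PySpark; the DataFrame is hypothetical), collect() materializes every row from every partition on the driver:

import org.apache.spark.sql.{Row, SparkSession}

val spark = SparkSession.builder().appName("collect-example").master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq(("Alice", 1), ("Bob", 2)).toDF("name", "id")

val rows: Array[Row] = df.collect()   // all rows brought back to the driver
rows.foreach(println)                 // prints [Alice,1] and [Bob,2]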


2 Answers

You can use head:

df.head().getInt(0)

or first:

df.first().getInt(0)

Check the DataFrame Scala docs for more details.
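
As a rough end-to-end sketch (the names and the query are hypothetical, Spark 2.x assumed), for a query that yields exactly one row with one Int column:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

val spark = SparkSession.builder().appName("single-value").master("local[*]").getOrCreate()
import spark.implicits._

// hypothetical query guaranteed to return a single Int value
val df = Seq(1, 2, 3).toDF("n").agg(sum("n").cast("int").as("total"))

val total: Int = df.head().getInt(0)   // first() is an alias of head()
println(total)                         // 6

Note that both head() and first() throw on an empty DataFrame, so the single-row guarantee in the question matters here.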

answered Sep 20 '22 by kostya


This could solve your problem.

df.map { row => row.getInt(0) }.first()
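
As a rough sketch of the same approach (hypothetical DataFrame, Spark 2.x assumed), note that mapping a Dataset[Row] to Int relies on the implicit encoders from spark.implicits._:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("map-first").master("local[*]").getOrCreate()
import spark.implicits._

// hypothetical single-column, single-row DataFrame
val df = Seq(42).toDF("answer")

val value: Int = df.map { row => row.getInt(0) }.first()
println(value)   // 42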
answered Sep 24 '22 by Till Rohrmann