sparklyr: write data to HDFS or Hive

Tags:

sparklyr

I tried using sparklyr to write data to HDFS or Hive, but was unable to find a way. Is it even possible to write an R data frame to HDFS or Hive using sparklyr? Please note that my R and Hadoop installations run on two different servers, so I need a way to write to a remote HDFS from R.

Regards Rahul

Rahul asked Jun 27 '17 21:06


2 Answers

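Writing a Spark table to Hive using sparklyr (this assumes sc is an open Spark connection with Hive support):

library(sparklyr)
# Copy the local data frame into Spark and register it under an explicit table name
iris_spark_table <- copy_to(sc, iris, name = "iris_spark_table", overwrite = TRUE)
# Create the Hive table from the registered Spark table
DBI::dbGetQuery(sc, "CREATE TABLE iris_hive AS SELECT * FROM iris_spark_table")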
Jeereddy answered Nov 10 '22 21:11


As of the latest sparklyr, you can use spark_write_table. Pass the name in the format database.table_name to specify a database:

iris_spark_table <- copy_to(sc, iris, overwrite = TRUE)
spark_write_table(
  iris_spark_table,
  name = 'my_database.iris_hive',
  mode = 'overwrite'
)

Also see this SO post, where I got some input on more options.
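
For the HDFS half of the question, here is a minimal sketch, assuming the Spark session behind sc can reach the cluster's NameNode. The host name, port, and path are placeholders, and hdfs_uri and write_df_to_hdfs are hypothetical helpers written for this illustration, not part of sparklyr. copy_to and spark_write_parquet are the actual sparklyr calls doing the work.

```r
# Build an HDFS URI. Host, port and path are placeholders for your own cluster.
hdfs_uri <- function(host, port, path) {
  sprintf("hdfs://%s:%d/%s", host, as.integer(port), sub("^/+", "", path))
}

# Copy a local R data frame into Spark, then write it to the given URI as Parquet.
# `name` is only the temporary table name used inside the Spark session.
write_df_to_hdfs <- function(sc, df, uri, name = "r_df_tmp") {
  sdf <- sparklyr::copy_to(sc, df, name = name, overwrite = TRUE)
  sparklyr::spark_write_parquet(sdf, path = uri, mode = "overwrite")
  invisible(uri)
}

# Example usage (needs a live connection; the master value is an assumption):
# sc <- sparklyr::spark_connect(master = "yarn")
# write_df_to_hdfs(sc, iris,
#                  hdfs_uri("namenode.example.com", 8020, "/user/rahul/iris"))
```

If R runs on a machine that cannot act as a Spark driver for the cluster, connecting through Livy with spark_connect(master = "http://<livy-host>:8998", method = "livy") may be an option. Whether that fits depends on how the cluster is set up.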

blakiseskream answered Nov 10 '22 22:11