
SparklyR removing a Table from Spark Context

I would like to remove a single table from the Spark context ('sc'). I know a single cached table can be un-cached, but as far as I can tell that is not the same as removing the table from sc.

library(sparklyr)
library(dplyr)
library(titanic)
library(Lahman)

spark_install(version = "2.0.0")
sc <- spark_connect(master = "local")

batting_tbl <- copy_to(sc, Lahman::Batting, "batting")
titanic_tbl <- copy_to(sc, titanic_train, "titanic", overwrite = TRUE)
src_tbls(sc) 
# [1] "batting" "titanic"

tbl_cache(sc, "batting") # Speeds up computations -- loaded into memory
src_tbls(sc) 
# [1] "batting" "titanic"

tbl_uncache(sc, "batting")
src_tbls(sc) 
# [1] "batting" "titanic"

To disconnect sc entirely, I would use spark_disconnect(sc), but in this example that would destroy both the "titanic" and "batting" tables stored in sc.

Instead, I would like to delete just one table, e.g. "batting", with something like spark_disconnect(sc, tableToRemove = "batting"), but this doesn't seem possible.

asked Dec 07 '16 18:12 by eyeOfTheStorm



1 Answer

dplyr::db_drop_table(sc, "batting")

I tried this function and it seems to work.

answered Sep 28 '22 03:09 by Sonic
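
For reference, the same effect can probably be had by sending a plain Spark SQL DROP TABLE statement through DBI, which may help if dplyr::db_drop_table is not available in your dplyr version. This is only a minimal sketch, assuming sc is the open connection from the question. The names drop_table_sql and drop_spark_table are hypothetical helpers made up for illustration and are not part of sparklyr or dplyr.

```r
# Hypothetical helper: build the Spark SQL statement that drops one table.
# Backticks in the name are rejected so the quoted identifier stays valid.
drop_table_sql <- function(name) {
  stopifnot(is.character(name), length(name) == 1,
            !grepl("`", name, fixed = TRUE))
  paste0("DROP TABLE IF EXISTS `", name, "`")
}

# Hypothetical helper: run the statement on an open connection.
# Assumes `sc` came from sparklyr::spark_connect(); other tables are untouched.
drop_spark_table <- function(sc, name) {
  DBI::dbGetQuery(sc, drop_table_sql(name))
  invisible(name)
}

# Usage, assuming the session from the question is still open:
# drop_spark_table(sc, "batting")
# src_tbls(sc)  # "batting" should no longer be listed
```

Unlike spark_disconnect(sc), this leaves the connection and the "titanic" table in place. Note that batting_tbl in the R session would still point at the dropped table, so it is probably worth removing it with rm(batting_tbl) as well.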