
spark-shell dependencies, translate from sbt

While checking how to use the Cassandra connector, I found that the documentation says to add this to the sbt file:

libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "1.6.0-M1"

In general, is there an obvious, straightforward rule for translating this into the corresponding:

spark-shell --packages "field1":"field2"

I've tried:

spark-shell --packages "com.datastax.spark":"spark-cassandra-connector"

and a few other variations, but none of them work.

elelias asked Feb 08 '23 11:02



1 Answer

I believe it is --packages "groupId:artifactId:version". If you have multiple packages, separate them with commas: --packages "groupId1:artifactId1:version1,groupId2:artifactId2:version2"

In sbt

val appDependencies = Seq(
  "com.datastax.spark" % "spark-cassandra-connector_2.10" % "1.6.0-M1"
)

and

val appDependencies = Seq(
  "com.datastax.spark" %% "spark-cassandra-connector" % "1.6.0-M1"
)

are identical when you build with Scala 2.10. If you use the %% syntax (after the groupId) in sbt, it automatically picks the artifact that matches your Scala version by appending the Scala binary version to the artifact name. So with Scala 2.10, spark-cassandra-connector becomes spark-cassandra-connector_2.10. I'm not sure spark-shell does this for you, so you might need to ask for the Scala 2.10 build of your artifact explicitly, like this:

spark-shell --packages "com.datastax.spark:spark-cassandra-connector_2.10:1.6.0-M1"
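To make the translation rule concrete, here is a minimal sketch of it as a shell function. The name sbt_to_coord is hypothetical (it is not part of Spark or sbt), and you have to supply the Scala binary version yourself, since that is the part sbt's %% would otherwise fill in. The script only prints the command; it does not run spark-shell.

```shell
# Hypothetical helper (not part of Spark or sbt): build a --packages
# coordinate from the three parts of an sbt dependency.
# Usage: sbt_to_coord GROUP ARTIFACT VERSION [SCALA_BINARY_VERSION]
# Pass the fourth argument only when the sbt line uses %% instead of %.
sbt_to_coord() {
  group="$1"
  artifact="$2"
  version="$3"
  scala_binary="${4:-}"
  # %% in sbt appends "_<scala binary version>" to the artifact name.
  if [ -n "$scala_binary" ]; then
    artifact="${artifact}_${scala_binary}"
  fi
  printf '%s:%s:%s\n' "$group" "$artifact" "$version"
}

# The dependency from the question, assuming a Scala 2.10 build of Spark.
coord="$(sbt_to_coord com.datastax.spark spark-cassandra-connector 1.6.0-M1 2.10)"
echo "spark-shell --packages $coord"
```

For a dependency declared with a single %, leave out the fourth argument and the artifact name is used unchanged.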

Daniel B. answered Feb 16 '23 10:02
