Which version of the Spark library supports SparkSession?

Spark code using SparkSession:

   import org.apache.spark.sql.SparkSession

   // Build (or reuse) a local session with Hive support enabled.
   val spark = SparkSession.builder
     .master("local")
     .appName("testing")
     .enableHiveSupport()  // <- enable Hive support.
     .getOrCreate()

My pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.cms.spark</groupId>
    <artifactId>cms-spark</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>cms-spark</name>

    <pluginRepositories>
        <pluginRepository>
            <id>scala-tools.org</id>
            <name>Scala-tools Maven2 Repository</name>
            <url>http://scala-tools.org/repo-releases</url>
        </pluginRepository>
    </pluginRepositories>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.6.0</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>1.6.0</version>
        </dependency>

        <dependency>
            <groupId>com.databricks</groupId>
            <artifactId>spark-csv_2.10</artifactId>
            <version>1.4.0</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.10</artifactId>
            <version>1.5.2</version>
        </dependency>

        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.8.3</version>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.5.3</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id> <!-- this is used for inheritance merges -->
                        <phase>install</phase> <!-- bind to the packaging phase -->
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>

    </build>
</project>

I have a problem. I wrote Spark code that uses SparkSession, but SparkSession cannot be found in the Spark SQL library, so I can't run the code. Which version of Spark do I need in order to use SparkSession? My pom.xml is shown above.

Thanks.

RJK asked May 20 '16 03:05


2 Answers

You need both the core and SQL artifacts:

<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.0.0-cloudera1-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.0.0-cloudera1-SNAPSHOT</version>
    </dependency>
</dependencies> 

mat77 answered Nov 13 '22 03:11


You need Spark 2.0 to use SparkSession. For now, it is available in the Maven Central snapshot repository:

groupId = org.apache.spark
artifactId = spark-core_2.11
version = 2.0.0-SNAPSHOT

The same version has to be specified for the other Spark artifacts. Note that 2.0 is still in beta and, AFAIK, is expected to be stable in about a month.
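
One way to keep the versions aligned is a shared Maven property. The fragment below is only a sketch: the property name `spark.version` is chosen here for illustration, and it assumes `spark-hive_2.11` (needed for `enableHiveSupport()`) is published at the same snapshot version as the core and SQL artifacts. The `_2.11` suffix is the Scala binary version and should also match across all Spark artifacts.

```xml
<properties>
    <!-- Assumed property name; set it to the Spark 2.0 build you use. -->
    <spark.version>2.0.0-SNAPSHOT</spark.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <!-- Needed for enableHiveSupport(); assumed to be published at the same version. -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
</dependencies>
```

With this layout, switching to a different Spark build means changing only the property value.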

Update: alternatively, you can use the Cloudera fork of Spark 2.0:

groupId = org.apache.spark
artifactId = spark-core_2.11
version = 2.0.0-cloudera1-SNAPSHOT

The Cloudera repository has to be specified in your Maven repositories list:

<repository>
   <id>cloudera</id>
   <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>

Vitalii Kotliarenko answered Nov 13 '22 01:11