
Spark SQL package not found

I am quite new to Spark and have run into the following problem: when I try to import SQLContext with:

import org.apache.spark.sql.SQLContext;

or try to initialize SQLContext variable explicitly:

SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);

I get an error from Eclipse:

The import org.apache.spark.sql.SQLContext cannot be resolved

I have added Spark to the dependency file, and everything else works fine except for SQLContext. The whole code:

package main.java;

import java.io.Serializable;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

import org.apache.spark.sql.SQLContext;

public class SparkTests {
    public static void main(String[] args){
        SparkConf conf = new SparkConf().setAppName("SparkMain");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);

        //DataFrame df = sqlContext
        System.out.println("\n\n\nHello world!\n\n\n");
    }
}

When I try to compile it with mvn package, I get the compilation error:

package org.apache.spark.sql does not exist

Any ideas why the SQL package cannot be found?

EDIT:

The dependency file pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <groupId>edu.berkeley</groupId>
    <artifactId>simple-project</artifactId>
    <modelVersion>4.0.0</modelVersion>
    <name>Simple Project</name>
    <packaging>jar</packaging>
    <version>1.0</version>
    <dependencies>
        <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.6.1</version>
        </dependency>
    </dependencies>
</project>
asked Mar 30 '16 by Belphegor

1 Answer

If you want to use Spark SQL or DataFrames in your project, you'll have to add the spark-sql artifact as a dependency. In this particular case:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId> <!-- matching Scala version -->
    <version>1.6.1</version>  <!-- matching Spark Core version -->
</dependency>

should do the trick.
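For reference, a sketch of what the `<dependencies>` block in the question's pom.xml would look like after adding spark-sql (keeping the Scala 2.10 suffix and Spark 1.6.1 version to match the existing spark-core entry):

```xml
<dependencies>
    <dependency> <!-- Spark core -->
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.6.1</version>
    </dependency>
    <dependency> <!-- Spark SQL, provides org.apache.spark.sql.SQLContext -->
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.10</artifactId>
        <version>1.6.1</version>
    </dependency>
</dependencies>
```

Note that the Scala version suffix (`_2.10`) and the Spark version must be identical across all Spark artifacts, or you'll hit binary incompatibilities at runtime.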

answered Oct 11 '22 by zero323