Due to a recent EKS update on AWS, I could no longer run Spark jobs on AWS (the Kubernetes client version had to be upgraded). I have therefore successfully built the latest Spark snapshot (2.4.5-SNAPSHOT, which contains the bugfix I need). Now I want to add it to my project, replacing the old 2.3.3 version.
Unfortunately, I get a compilation error (see below).
I am probably doing something wrong in my pom.xml file. The final goal is to fetch jar files both from remote repositories and from my local repo.
Any ideas? Thanks!
P.S. Ubuntu 18.04 + IntelliJ
The relevant parts of the pom.xml file are the following:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
Here I add my local repo:
    <!-- My local repo where the jar file has been placed -->
    <repositories>
        <repository>
            <id>Local</id>
            <name>Repository Spark</name>
            <url>/home/cristian/repository/sparkyspark/spark</url>
        </repository>
    </repositories>
    <groupId>sparkjob</groupId>
    <artifactId>sparkjob</artifactId>
    <version>1.0-SNAPSHOT</version>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <maven.test.skip>true</maven.test.skip>
    </properties>
    <build>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>entry.Main</mainClass>
                        </manifest>
                    </archive>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <!-- bind to the packaging phase -->
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-enforcer-plugin</artifactId>
                <version>1.4.1</version>
                <configuration>
                    <rules>
                        <dependencyConvergence/>
                    </rules>
                </configuration>
            </plugin>
        </plugins>
    </build>
...
    <dependencies>
        ...
        Here it is, the jar file I need:
        <!-- The latest Spark jar file -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.4.5-SNAPSHOT</version>
            <exclusions>
                <exclusion>
                    <groupId>com.fasterxml.jackson.core</groupId>
                    <artifactId>jackson-core</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        ...
    </dependencies>
This is the error message; the path is correct and the file is there. Any ideas? :)
ERROR:
Could not resolve dependencies for project sparkjob:sparkjob:jar:1.0-SNAPSHOT: Failed to collect dependencies at org.apache.spark:spark-core_2.11:jar:2.4.5-SNAPSHOT: Failed to read artifact descriptor for org.apache.spark:spark-core_2.11:jar:2.4.5-SNAPSHOT: Could not transfer artifact org.apache.spark:spark-core_2.11:pom:2.4.5-SNAPSHOT from/to Local (/home/cristian/repository/sparkyspark/spark): Cannot access /home/cristian/repository/sparkyspark/spark with type default using the available connector factories.....
UPDATE: hard-wiring the path with a system-scoped dependency seems to be a good workaround...
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.4.5-SNAPSHOT</version>
    <scope>system</scope>
    <systemPath>/home/cristian/repository/sparkyspark/spark/spark-core_2.11-2.4.5-SNAPSHOT.jar</systemPath>
    <exclusions>
        <exclusion>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>
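Be aware that `system` scope is deprecated in Maven, and system-scoped dependencies are typically not bundled into the `jar-with-dependencies` assembly. A cleaner alternative, sketched here under the assumption that the jar sits at the path above, is to install it into the local `~/.m2` repository and keep the dependency with the default scope:

```shell
# Install the locally built jar into ~/.m2 so Maven can resolve it normally;
# add -DpomFile=<path-to-its-pom> as well if transitive dependencies must resolve
mvn install:install-file \
    -Dfile=/home/cristian/repository/sparkyspark/spark/spark-core_2.11-2.4.5-SNAPSHOT.jar \
    -DgroupId=org.apache.spark \
    -DartifactId=spark-core_2.11 \
    -Dversion=2.4.5-SNAPSHOT \
    -Dpackaging=jar
```

After this, the plain `<dependency>` block (without `<scope>` and `<systemPath>`) should resolve against the local repository.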
If you want to use a folder as a repository, you have to use the file:// protocol.
So your repository config should be:
<repositories>
    <repository>
        <id>Local</id>
        <name>Repository Spark</name>
        <url>file:///home/cristian/repository/sparkyspark/spark</url>
    </repository>
</repositories>
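Also note that a repository exposed this way must follow the standard Maven repository layout: Maven builds the artifact path from the groupId, artifactId, and version, so the jar and its pom are expected at something like the following (a sketch, assuming your repo root from above):

```
/home/cristian/repository/sparkyspark/spark/
└── org/apache/spark/spark-core_2.11/2.4.5-SNAPSHOT/
    ├── spark-core_2.11-2.4.5-SNAPSHOT.jar
    └── spark-core_2.11-2.4.5-SNAPSHOT.pom
```

Simply dropping the jar in the root folder will not work, and for SNAPSHOT versions a `maven-metadata.xml` may also be consulted during resolution.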