Spark SQL has no SparkSqlParser.scala file when compiling in IntelliJ IDEA

I have installed a Spark/Hadoop environment on 64-bit Red Hat, and I also want to read and modify the Spark source code in IntelliJ IDEA. I downloaded the Spark source code and set everything up, but I get errors when compiling the Spark project in IntelliJ IDEA. Here are the errors:

/home/xuch/IdeaProjects/spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala

Error:(809, 34) not found: value SparkSqlParser
    case ast if ast.tokenType == SparkSqlParser.TinyintLiteral =>

Error:(812, 34) not found: value SparkSqlParser
    case ast if ast.tokenType == SparkSqlParser.SmallintLiteral =>

...

However, I could not find a file named SparkSqlParser.scala anywhere in the project, nor a Scala class named SparkSqlParser.

I did find some files named SparkSqlParser.scala on the web, but they have no members such as "TinyintLiteral" or "SmallintLiteral". Here are the links:

  • https://github.com/yjshen/zzzzobspk/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SparkSQLParser.scala

  • https://apache.googlesource.com/spark/+/c152dde78f73d5ce3a483fd60a47e7de1f1916da/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SparkSQLParser.scala

Lyroe Chan asked Feb 25 '16 02:02


3 Answers

I ran into the same problem. Here is my solution:

  1. Install the ANTLR v4 (antlr4) plugin for IntelliJ IDEA. IntelliJ IDEA will then recognize the file "spark-2.0.1\sql\catalyst\src\main\antlr4\org\apache\spark\sql\catalyst\parser\SqlBase.g4".
  2. Go to View -> Tool Windows -> Maven Projects, select the project "Spark Project Catalyst", right-click it, and choose "Generate Sources and Update Folders".
  3. After that, new files appear under "spark-2.0.1\sql\catalyst\target\generated-sources\antlr4".
  4. The project should then build successfully.

Hope this helps.
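
To confirm that step 3 actually produced files, the output folder can be listed with a small script. This is only a sketch: the function name `list_generated_sources` is made up for illustration, and the folder layout is assumed from the paths quoted above.

```python
from pathlib import Path


def list_generated_sources(catalyst_dir, subdir="antlr4"):
    """Return the .java files found under target/generated-sources/<subdir>.

    catalyst_dir is assumed to be the Catalyst module folder
    (for example spark-2.0.1/sql/catalyst). Paths are returned relative
    to the generated-sources subfolder, sorted alphabetically.
    """
    root = Path(catalyst_dir) / "target" / "generated-sources" / subdir
    if not root.is_dir():
        # Nothing has been generated yet (or the folder was cleaned).
        return []
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.java"))
```

Calling `list_generated_sources("spark-2.0.1/sql/catalyst")` returns an empty list when the sources have not been generated, which is a hint to repeat step 2.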

Wenbin Zhang answered Oct 15 '22 02:10


None of the advice here worked for me. I noticed, however, that the generated code depended on ANTLR 3.x, while ANTLR 4.x is what appears in the dependencies (mvn dependency:tree). I don't know why this was the case. Maybe it was because I had earlier built the project from the command line.

Anyway, try cleaning your Catalyst sub-project and then regenerating the autogenerated sources. To do this in IntelliJ, go to View -> Tool Windows -> Maven Projects.

Then navigate to "Spark Project Catalyst" in the "Maven Projects" tab.

Navigate to clean -> clean:clean and double-click it. Then navigate to Plugins -> antlr4 -> antlr4:antlr4 and double-click it.

The autogenerated ANTLR classes should now be different, and they should compile. YMMV.
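
The same two goals can probably be run outside the IDE as well. The sketch below builds the Maven command line and runs it from the Spark source root. It assumes `mvn` is on the PATH and that the Catalyst module lives at `sql/catalyst`, as in the paths from the question; the function names are made up for illustration. `-pl` is Maven's option for restricting a build to the listed modules.

```python
import subprocess


def catalyst_regen_command(mvn="mvn", module="sql/catalyst"):
    """Build the command that cleans one module and regenerates its ANTLR sources.

    The goals clean:clean and antlr4:antlr4 are the ones double-clicked
    in the Maven Projects tab; -pl limits the run to the given module.
    """
    return [mvn, "-pl", module, "clean:clean", "antlr4:antlr4"]


def regenerate(spark_root, mvn="mvn"):
    """Run the command from the Spark source root and return Maven's exit code."""
    return subprocess.run(catalyst_regen_command(mvn), cwd=spark_root).returncode
```

For example, `regenerate("/home/xuch/IdeaProjects/spark")` would run `mvn -pl sql/catalyst clean:clean antlr4:antlr4` in that checkout; an exit code of 0 means Maven reported success.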

PHenry answered Oct 15 '22 01:10


1) First build Spark from the command line, following the build instructions at http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn

2) Then check the $SPARK_HOME/sql/catalyst/target/generated-sources/antlr3/org/apache/spark/sql/catalyst/parser folder.

Generated classes such as SparkSqlLexer.java should be there.

The classes it generates are:

    SparkSqlLexer.java
    SparkSqlParser.java
    SparkSqlParser_ExpressionParser.java
    SparkSqlParser_FromClauseParser.java
    SparkSqlParser_IdentifiersParser.java
    SparkSqlParser_KeywordParser.java
    SparkSqlParser_SelectClauseParser.java

3) Open Module Settings, click the spark-catalyst module, and go to the Sources tab on the right. Mark target/generated-sources as a source folder. Attaching a pic to give an idea.
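
The check in step 2 can also be scripted. The sketch below looks for the seven generated classes listed above and reports any that are missing. The helper name `missing_generated_classes` and the constant names are hypothetical, and the folder layout is taken from step 2.

```python
from pathlib import Path

# Generated classes named in step 2 of this answer.
EXPECTED_CLASSES = [
    "SparkSqlLexer.java",
    "SparkSqlParser.java",
    "SparkSqlParser_ExpressionParser.java",
    "SparkSqlParser_FromClauseParser.java",
    "SparkSqlParser_IdentifiersParser.java",
    "SparkSqlParser_KeywordParser.java",
    "SparkSqlParser_SelectClauseParser.java",
]

# Package folder of the generated parser, relative to generated-sources/antlr3.
PARSER_PACKAGE = "org/apache/spark/sql/catalyst/parser"


def missing_generated_classes(spark_home):
    """Return the expected generated files that are absent under spark_home."""
    folder = (
        Path(spark_home)
        / "sql/catalyst/target/generated-sources/antlr3"
        / PARSER_PACKAGE
    )
    return [name for name in EXPECTED_CLASSES if not (folder / name).is_file()]
```

An empty result from `missing_generated_classes(os.environ["SPARK_HOME"])` (after `import os`) suggests the command-line build generated everything, so only the source-folder setting in step 3 remains.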

Rishitesh Mishra answered Oct 15 '22 03:10