Spark Java: Creating a new Dataset with a given schema

Tags: java, scala, apache-spark, apache-spark-dataset

I have this code that works well in Scala:

val schema = StructType(Array(
        StructField("field1", StringType, true),
        StructField("field2", TimestampType, true),
        StructField("field3", DoubleType, true),
        StructField("field4", StringType, true),
        StructField("field5", StringType, true)
    ))

val df = spark.read
    // some options
    .schema(schema)
    .load(myEndpoint)

I want to do something similar in Java, so I wrote the following:

final StructType schema = new StructType(new StructField[] {
     new StructField("field1",  new StringType(), true,new Metadata()),
     new StructField("field2", new TimestampType(), true,new Metadata()),
     new StructField("field3", new StringType(), true,new Metadata()),
     new StructField("field4", new StringType(), true,new Metadata()),
     new StructField("field5", new StringType(), true,new Metadata())
});

Dataset<Row> df = spark.read()
    // some options
    .schema(schema)
    .load(myEndpoint);

But this gives me the following error:

Exception in thread "main" scala.MatchError: org.apache.spark.sql.types.StringType@37c5b8e8 (of class org.apache.spark.sql.types.StringType)

Nothing seems wrong with my schemas, so I don't really know what the problem is here. The schema inferred from the data and the one I built print the same fields and types:

spark.read().load(myEndpoint).printSchema();
root
 |-- field5: string (nullable = true)
 |-- field2: timestamp (nullable = true)
 |-- field1: string (nullable = true)
 |-- field4: string (nullable = true)
 |-- field3: string (nullable = true)

schema.printTreeString();
root
 |-- field1: string (nullable = true)
 |-- field2: timestamp (nullable = true)
 |-- field3: string (nullable = true)
 |-- field4: string (nullable = true)
 |-- field5: string (nullable = true)

EDIT:

Here is a data sample:

spark.read().load(myEndpoint).show(false);
+---------------------------------------------------------------+-------------------+-------------+--------------+---------+
|field5                                                         |field2             |field1       |field4        |field3   |
+---------------------------------------------------------------+-------------------+-------------+--------------+---------+
|{"fieldA":"AAA","fieldB":"BBB","fieldC":"CCC","fieldD":"DDD"}  |2018-01-20 16:54:50|SOME_VALUE   |SOME_VALUE    |0.0      |
|{"fieldA":"AAA","fieldB":"BBB","fieldC":"CCC","fieldD":"DDD"}  |2018-01-20 16:58:50|SOME_VALUE   |SOME_VALUE    |50.0     |
|{"fieldA":"AAA","fieldB":"BBB","fieldC":"CCC","fieldD":"DDD"}  |2018-01-20 17:00:50|SOME_VALUE   |SOME_VALUE    |20.0     |
|{"fieldA":"AAA","fieldB":"BBB","fieldC":"CCC","fieldD":"DDD"}  |2018-01-20 18:04:50|SOME_VALUE   |SOME_VALUE    |10.0     |
 ...
+---------------------------------------------------------------+-------------------+-------------+--------------+---------+

asked Aug 01 '18 14:08

Nakeuh

1 Answer

Using the static factory methods and fields of the DataTypes class instead of the constructors worked for me in Spark 2.3.1:

    StructType schema = DataTypes.createStructType(new StructField[] {
            DataTypes.createStructField("field1",  DataTypes.StringType, true),
            DataTypes.createStructField("field2", DataTypes.TimestampType, true),
            DataTypes.createStructField("field3", DataTypes.StringType, true),
            DataTypes.createStructField("field4", DataTypes.StringType, true),
            DataTypes.createStructField("field5", DataTypes.StringType, true)
    });
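
As for why the constructors fail: the `StringType@37c5b8e8` in the error message suggests that Spark received a freshly constructed instance rather than the shared `StringType` object that `DataTypes.StringType` refers to, and that its internal match only recognises the shared one. The sketch below imitates that situation in plain Java without any Spark dependency. `FakeStringType`, `INSTANCE` and `describe` are hypothetical stand-ins invented for illustration; they are not Spark classes or methods.

```java
public class SingletonMatchSketch {

    // Hypothetical stand-in for a type class that exposes one shared instance
    // but whose constructor can still be called by client code.
    public static class FakeStringType {
        public static final FakeStringType INSTANCE = new FakeStringType();

        public FakeStringType() {
        }
    }

    // Imitates a match that only recognises the shared instance and
    // rejects anything else, the way a failed pattern match would.
    public static String describe(Object type) {
        if (type == FakeStringType.INSTANCE) {
            return "string";
        }
        throw new IllegalStateException("no match for " + type);
    }

    public static void main(String[] args) {
        // The shared instance is recognised.
        System.out.println(describe(FakeStringType.INSTANCE));

        // A new instance of the same class is not.
        try {
            describe(new FakeStringType());
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If that reading is right, passing `DataTypes.StringType`, `DataTypes.TimestampType` and so on hands Spark the shared objects it expects, which would be why the schema above loads without a `MatchError`.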

answered Sep 30 '22 03:09

Álvaro Valencia
