I am trying to write a JSON file using Spark. Some keys have null as their value. These show up fine in the Dataset, but when I write the file the keys get dropped. How do I ensure they are retained?
Code used to write the file:
ddp.coalesce(20).write().mode("overwrite").json("hdfs://localhost:9000/user/dedupe_employee");
Part of the JSON data from the source:
"event_header": {
"accept_language": null,
"app_id": "App_ID",
"app_name": null,
"client_ip_address": "IP",
"event_id": "ID",
"event_timestamp": null,
"offering_id": "Offering",
"server_ip_address": "IP",
"server_timestamp": 1492565987565,
"topic_name": "Topic",
"version": "1.0"
}
Output:
"event_header": {
"app_id": "App_ID",
"client_ip_address": "IP",
"event_id": "ID",
"offering_id": "Offering",
"server_ip_address": "IP",
"server_timestamp": 1492565987565,
"topic_name": "Topic",
"version": "1.0"
}
In the above example the keys accept_language, app_name and event_timestamp have been dropped.
The keys are not being rejected by JSON itself: null is a valid JSON value that can be set on any field, whether it otherwise holds a string, number, boolean, array or object, so the source data above is perfectly legal. The keys disappear because Spark's JSON writer omits fields whose value is null when it generates the output.
(SQL Server behaves the same way with its FOR JSON clause: properties that are null in the query results are left out of the JSON output unless you specify the INCLUDE_NULL_VALUES option.)
Note that this is different from removing rows that contain NULL values in selected columns of a Spark DataFrame: for that you would call drop(columns: Seq[String]) or drop(columns: Array[String]) on df.na, passing the names of the columns to check for NULL values, which deletes the matching rows rather than preserving the null keys (a short sketch follows).
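A minimal sketch of that row-dropping variant, assuming a DataFrame named df with top-level columns matching the field names from the question (the variable and column names are illustrative, not from the original post):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Delete rows where any of the listed columns is NULL.
// This removes data; it does not keep null-valued keys in the JSON output.
Dataset<Row> withoutNullRows =
        df.na().drop(new String[]{"accept_language", "app_name", "event_timestamp"});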
If you are on Spark 3, you can set
spark.sql.jsonGenerator.ignoreNullFields false
in your Spark configuration so that the JSON writer keeps fields whose value is null (the option defaults to true, which is why they are dropped).
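A minimal sketch of two ways to apply this, reusing the ddp Dataset and HDFS path from the question (Spark 3.0+; the app name is a placeholder, and the per-write ignoreNullFields option has the same effect as the session-level config for this output):

import org.apache.spark.sql.SparkSession;

// Session-level: keep null-valued fields for every JSON write in this session.
SparkSession spark = SparkSession.builder()
        .appName("dedupe_employee")
        .config("spark.sql.jsonGenerator.ignoreNullFields", "false")
        .getOrCreate();

// Per-write: override just for this output.
ddp.coalesce(20)
   .write()
   .mode("overwrite")
   .option("ignoreNullFields", "false")
   .json("hdfs://localhost:9000/user/dedupe_employee");

With either setting, accept_language, app_name and event_timestamp are written back as explicit null values instead of being omitted.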