My Java Spark program ingests a 3.7 GB file. When I launch the program and open the Spark UI at localhost:4040, the input size shown for the load stage is 7.3 GB, which is really confusing. Why does the Spark UI show an input size almost double the actual size of the file being ingested?
The input size: