I need to save the output of df.show() as a string so that I can email it directly.
For example, take the snippet below from the official Spark docs:
val df = spark.read.json("examples/src/main/resources/people.json")
// Displays the content of the DataFrame to stdout
df.show()
// +----+-------+
// | age|   name|
// +----+-------+
// |null|Michael|
// |  30|   Andy|
// |  19| Justin|
// +----+-------+
I need to save the table above, exactly as it is printed in the console, as a string. I looked at log4j for this, but couldn't find any information on capturing only the output of show().
Can someone help me with it?
A workaround is to redirect standard output to a variable:
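val baos = new java.io.ByteArrayOutputStream()
val ps = new java.io.PrintStream(baos)
val oldPs = Console.out
// Temporarily point Console.out at the in-memory buffer
Console.setOut(ps)
try {
  df.show()
} finally {
  ps.flush()
  // Always restore the original stream, even if show() throws
  Console.setOut(oldPs)
}
val content = baos.toString()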
Note that this produces one deprecation warning: Console.setOut has been deprecated since Scala 2.11 in favor of Console.withOut.
You can also re-implement the method Dataset.showString, which generates the table string. It uses take in the background. Maybe it's also a good moment to create a PR to make showString public? :)
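As a rough illustration of the re-implementation idea, here is a minimal sketch of a formatter that lays out a header and rows in a style similar to show(). The names ShowStringSketch and formatTable are hypothetical, and this is not Spark's actual showString: it skips cell truncation, the minimum column width, and the "only showing top N rows" footer, and it assumes every row has as many cells as the header.

```scala
object ShowStringSketch {
  // Hypothetical helper: renders a header and rows as an ASCII table.
  // Assumes each row has exactly header.length cells.
  def formatTable(header: Seq[String], rows: Seq[Seq[String]]): String = {
    // Column width = widest cell in that column, header included
    val widths = header.indices.map { i =>
      (header(i) +: rows.map(_(i))).map(_.length).max
    }
    val separator = widths.map(w => "-" * w).mkString("+", "+", "+")

    // Right-align every cell by padding on the left
    def line(cells: Seq[String]): String =
      cells.zip(widths)
        .map { case (cell, w) => (" " * (w - cell.length)) + cell }
        .mkString("|", "|", "|")

    val body = rows.map(cells => line(cells))
    (Seq(separator, line(header), separator) ++ body ++ Seq(separator)).mkString("\n")
  }
}
```

With a DataFrame, the inputs could presumably be built from df.columns for the header and from df.take(n), converting each Row with toSeq and mapping every value to a string (with nulls rendered as "null").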
scala.Console has a withOut method for this kind of thing:
import java.io.ByteArrayOutputStream

val outCapture = new ByteArrayOutputStream()
// Console.out points at outCapture only inside this block
Console.withOut(outCapture) {
  df.show()
}
val result = new String(outCapture.toByteArray)
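If the captured text is needed in several places, the withOut approach can be wrapped in a small helper. The sketch below is one possible shape; the names OutputCapture and capture are hypothetical. It fixes the encoding to UTF-8 on both the write and the read side so the result does not depend on the platform default charset.

```scala
import java.io.{ByteArrayOutputStream, PrintStream}

object OutputCapture {
  // Hypothetical helper: runs `body` and returns whatever it printed
  // through scala.Console (e.g. via println) as a String.
  def capture(body: => Unit): String = {
    val buffer = new ByteArrayOutputStream()
    val stream = new PrintStream(buffer, true, "UTF-8")
    try {
      // Console.out is redirected only for the duration of `body`
      Console.withOut(stream) { body }
    } finally {
      stream.flush()
    }
    buffer.toString("UTF-8")
  }
}
```

Usage would then look like val table = OutputCapture.capture { df.show() }. Keep in mind that this only sees output written through scala.Console; anything printed directly via System.out is not redirected.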