Improve PySpark DataFrame.show output to fit Jupyter notebook

Using PySpark in a Jupyter notebook, the output of Spark's DataFrame.show is low-tech compared to how Pandas DataFrames are displayed. I thought "Well, it does the job", until I got this:

[Screenshot of the wrapped DataFrame.show output: https://i.stack.imgur.com/1k6OP.png]

The output is not adjusted to the width of the notebook, so the lines wrap in an ugly way. Is there a way to customize this? Even better, is there a way to get Pandas-style output (without converting to pandas.DataFrame, obviously)?

448

asked May 25 '18 07:05

clstaudt

2 Answers

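This is now possible natively as of Spark 2.4.0 by setting spark.sql.repl.eagerEval.enabled to True. With that setting on, a DataFrame evaluated as the last expression of a notebook cell is rendered as an HTML table: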

[Screenshot: https://i.stack.imgur.com/N7Jv3.png]
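
For reference, here is a minimal sketch of how the setting might be applied when creating the session. The helper name build_session, the app name, and the two extra settings (maxNumRows and truncate, shown with what I believe are their default values) are my own additions for illustration and are not part of the original answer.

```python
# Eager-evaluation settings for notebook display (Spark 2.4.0+).
# Only the first key comes from the answer above; the other two are
# optional tuning knobs, added here as an assumption about what you may want.
EAGER_EVAL_CONF = {
    "spark.sql.repl.eagerEval.enabled": "true",
    "spark.sql.repl.eagerEval.maxNumRows": "20",  # rows rendered per DataFrame
    "spark.sql.repl.eagerEval.truncate": "20",    # max characters per cell
}


def build_session(app_name="notebook"):
    """Create (or reuse) a SparkSession with eager evaluation switched on.

    `build_session` is a hypothetical helper name, not a Spark API.
    """
    # Imported lazily so the settings above can be inspected without Spark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in EAGER_EVAL_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

In a notebook you would call spark = build_session() once and then simply end a cell with df (no .show()) to get the table. If a session already exists, spark.conf.set("spark.sql.repl.eagerEval.enabled", "true") should have the same effect, as far as I can tell.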

126

answered Sep 30 '22 18:09

Kyle Barron

After playing around with my table, which has a lot of columns, I decided that the best way to get a feel for the data is to use:

df.show(n=5, truncate=False, vertical=True)

This prints each row vertically, one field per line, without truncation, and is the cleanest view I have come up with.
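
If you use this a lot, it can be wrapped in a small helper. This is only a sketch: peek and demo are hypothetical names of mine, and the sample data is made up for illustration.

```python
def peek(df, n=5):
    """Print the first n rows one field per line, without truncating values.

    `peek` is a hypothetical convenience wrapper around DataFrame.show.
    """
    df.show(n=n, truncate=False, vertical=True)


def demo():
    """Example usage; needs a working PySpark installation to run."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("peek-demo").getOrCreate()
    # Made-up sample rows, just to have something wide-ish to display.
    df = spark.createDataFrame(
        [(1, "alice", "a fairly long free-text comment that would normally be cut off")],
        ["id", "name", "comment"],
    )
    peek(df, n=1)
```

Because each row becomes its own block of "column | value" lines, the output no longer depends on the width of the notebook.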

answered Sep 30 '22 17:09

user1761806
