How do I page output written to the spark-shell console? For example, when I run the following command to list the defined terms from my session I often get a long list of output that exceeds the number of rows in my terminal.
$intp.definedTerms.foreach(println)
In a bash shell I would use less to page output from a command or program. Is there paging functionality, similar to less, available to spark-shell?
Thanks.
spark-shell itself doesn't have a pager, but at the end I link to information about Spark's pipe() operation on RDDs, which lets you send output through external programs.
WINDOW SCROLLING?
You don't say which environment you are in.
If, for example, you are in Unity on Ubuntu, or almost any other windowing system, would the scrollback of a terminal window satisfy your needs?
You can increase the terminal's scrollback limit (or set it up in .bashrc, as the links below suggest) before calling spark-shell.
There are also some useful GUI-based ways to affect window scrolling: https://askubuntu.com/questions/385901/how-to-see-more-lines-in-the-terminal
Here's another page with more suggestions on editing .bashrc; again, more info on your environment would be helpful: https://askubuntu.com/questions/51122/setting-gnome-terminal-window-size-from-within-bashrc
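If scrollback alone isn't enough, one workaround is to capture the long output to a file and then page it with less from a regular shell. A minimal sketch (the file path is arbitrary, and `terms` is a placeholder standing in for `$intp.definedTerms`, which only exists inside the spark-shell session):

```scala
// Sketch: dump long REPL output to a file and page it externally.
// Inside spark-shell you would iterate over $intp.definedTerms; the
// `terms` value below is a stand-in so the snippet is self-contained.
import java.io.PrintWriter

val terms = Seq("lines", "counts", "result") // placeholder for $intp.definedTerms
val out = new PrintWriter("/tmp/defined-terms.txt")
try terms.foreach(out.println) finally out.close()
// Then, from another terminal (or after quitting the shell):
//   less /tmp/defined-terms.txt
```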
PIPE ACTIONS on RDDS in Spark
Further, without your specific code it's hard to know whether this applies, but there is a way to pipe the output of RDD operations to external programs. See http://spark.apache.org/docs/latest/programming-guide.html#transformations; here's an excerpt:
Pipe each partition of the RDD through a shell command, e.g. a Perl or bash script. RDD elements are written to the process's stdin and lines output to its stdout are returned as an RDD of strings.
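In a spark-shell session, the excerpt above corresponds to something like the following sketch (assumes `sort` is on the PATH of every executor; `sc` is the SparkContext that spark-shell provides):

```scala
// Hedged sketch of RDD.pipe: each partition's elements are written to
// the external command's stdin, and the command's stdout lines come
// back as a new RDD of strings.
val words = sc.parallelize(Seq("banana", "apple", "cherry"))
val piped = words.pipe("sort") // runs `sort` once per partition
piped.collect().foreach(println)
```

Note that the command runs once per partition, so any ordering it produces is per-partition rather than global.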