
How can I page output in spark-shell?

How do I page output written to the spark-shell console? For example, when I run the following command to list the defined terms from my session, the output often exceeds the number of rows in my terminal.

$intp.definedTerms.foreach{println(_)}

In a bash shell I would use less to page output from a command or program. Is there paging functionality, similar to less, available to spark-shell?

Thanks.

Curt Holden asked Jan 23 '26 19:01


1 Answer

spark-shell doesn't have a built-in pager, but at the end I link to information about Spark's pipe() transformation on RDDs, which lets you send RDD elements through external programs.
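Since there is no pager in the shell itself, one plain-Scala workaround is to print a long collection one slice at a time. This is only a minimal sketch using the Scala standard library; pageOf and showPage are hypothetical helper names, not part of Spark or the REPL, and the data is a stand-in for your real output:

```scala
// Hypothetical helpers, not part of spark-shell or Spark.

// Return the items on a zero-based page of the given size.
def pageOf[A](items: Iterable[A], page: Int, pageSize: Int = 20): List[A] =
  items.slice(page * pageSize, (page + 1) * pageSize).toList

// Print one page; call again with the next page number to continue.
def showPage[A](items: Iterable[A], page: Int, pageSize: Int = 20): Unit =
  pageOf(items, page, pageSize).foreach(println)

// Example with stand-in data in place of the session's defined terms.
val terms = (1 to 45).map(i => s"term$i")
showPage(terms, 0)   // prints term1 through term20
showPage(terms, 2)   // prints term41 through term45
```

In the session from the question, the same helper could presumably be called as showPage($intp.definedTerms, 0), assuming definedTerms returns an ordinary Scala collection in your version of the shell.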

WINDOW SCROLLING?

You don't say which environment you are in.

If, for example, you are in Unity on Ubuntu, or almost any windowing system, would the scrolling function of the terminal window satisfy your needs?

You can edit .bashrc and modify that scrolling setting before calling spark-shell.

There are also some useful GUI-based ways to affect window scrolling: https://askubuntu.com/questions/385901/how-to-see-more-lines-in-the-terminal

Here's another page with more suggestions on editing .bashrc; again, more information about your environment would be helpful: https://askubuntu.com/questions/51122/setting-gnome-terminal-window-size-from-within-bashrc

PIPE TRANSFORMATION on RDDs in Spark

Further, although without your specific code it's hard to know whether this applies, there's a way to pipe the elements of an RDD through external programs using the pipe() transformation. See http://spark.apache.org/docs/latest/programming-guide.html#transformations; here's an excerpt:

Pipe each partition of the RDD through a shell command, e.g. a Perl or bash script. RDD elements are written to the process's stdin and lines output to its stdout are returned as an RDD of strings.
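To make the excerpt concrete, here is a rough sketch of pipe() as it might be typed into spark-shell. It assumes a Unix-like machine where the tr command is available; the sample data and the single-partition setting are illustrative choices, not anything from your code. Note that the external program's output comes back as a new RDD of strings rather than being displayed interactively, so this is not a pager in the sense that less is.

```scala
import org.apache.spark.sql.SparkSession

// Inside spark-shell, getOrCreate() returns the session the shell already started.
val spark = SparkSession.builder().master("local[1]").appName("pipe-sketch").getOrCreate()
val sc = spark.sparkContext

// Stand-in data; a single partition keeps the element order easy to follow.
val terms = sc.parallelize(Seq("alpha", "beta", "gamma"), numSlices = 1)

// Each element is written as one line to the stdin of `tr`;
// every line that `tr` prints to stdout becomes an element of the result.
val upper = terms.pipe("tr a-z A-Z")

upper.collect().foreach(println)   // prints ALPHA, BETA, GAMMA on separate lines
```

Because the result is just another RDD, you would still need to collect and print it, so terminal scrollback remains the simpler answer for long console output.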

JimLohse answered Jan 26 '26 14:01