
Spark output: log-style vs progress-style

The spark-submit output on two different clusters (both running Spark 1.2) looks different: one is "log-style", i.e., a voluminous stream of messages like

15/04/06 14:53:13 INFO TaskSetManager: Starting task 262.0 in stage 4.0 (TID 894, XXXXX, PROCESS_LOCAL, 1785 bytes)
15/04/06 14:53:13 INFO TaskSetManager: Finished task 255.0 in stage 4.0 (TID 892) in 155 ms on XXXXX (288/300)
15/04/06 14:53:13 INFO BlockManagerInfo: Added rdd_16_262 in memory on XXXXX:49388 (size: 14.3 MB, free: 1214.5 MB)
15/04/06 14:53:13 INFO TaskSetManager: Finished task 293.0 in stage 4.0 (TID 893) in 156 ms on XXXXX (289/300)
15/04/06 14:53:13 INFO TaskSetManager: Finished task 262.0 in stage 4.0 (TID 894) in 168 ms on XXXXX (290/300)
15/04/06 14:53:16 INFO TaskSetManager: Starting task 1.0 in stage 4.0 (TID 895, ip-10-0-3-92.ec2.internal, NODE_LOCAL, 1785 bytes)
15/04/06 14:53:16 INFO TaskSetManager: Starting task 74.0 in stage 4.0 (TID 896, XXXXX, NODE_LOCAL, 1785 bytes)

and the other "progress-style", i.e., a growing progress bar at the bottom of the screen (which may be interrupted by errors, if any).

How do I switch between the two styles? (either on a per-job or a per-cluster basis)

I tried passing --conf spark.ui.showConsoleProgress=true to spark-submit with no effect.

sds asked Apr 27 '15 12:04


1 Answer

I have run into this before. In my case, the two clusters simply had different log4j.rootCategory levels set in conf/log4j.properties.

The "progress-style" output appears on the cluster whose logging level is WARN, while the "log-style" output appears when the logging level is set to INFO.

Update (2015-05-10):

I came across the _progressBar startup logic in SparkContext (branch-1.4.0); it is actually controlled by two conditions:

_progressBar =
  if (_conf.getBoolean("spark.ui.showConsoleProgress", true) && !log.isInfoEnabled) {
    Some(new ConsoleProgressBar(this))
  } else {
    None
  }

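Therefore, to enable the progress-style output in the console, both conditions must hold: spark.ui.showConsoleProgress must be true (which is the default in the code above), and the log level in conf/log4j.properties must be raised so that INFO is not enabled, i.e., to WARN or ERROR. This would also explain why passing --conf spark.ui.showConsoleProgress=true alone has no effect on a cluster that logs at INFO.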
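As a rough illustration, a conf/log4j.properties along these lines should satisfy the second condition. This is only a sketch, assuming the file was copied from Spark's conf/log4j.properties.template and uses log4j 1.x syntax; the appender lines are the template's and may differ on your cluster. The only line that matters here is the root level:

```properties
# Sketch of conf/log4j.properties, assuming it started as a copy of
# conf/log4j.properties.template (your appender settings may differ).

# WARN (or ERROR) means INFO is not enabled, so the progress bar can be shown.
# Set this back to INFO to get the "log-style" output again.
log4j.rootCategory=WARN, console

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

Since this file lives in the Spark installation's conf directory, changing it here switches the style per cluster rather than per job.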

yjshen answered Oct 28 '22 04:10