In spark-submit, how to specify log4j.properties ?
Here is my script. I have tried all of combinations and even just use one local node. but looks like the log4j.properties is not loaded, all debug level info was dumped.
current_dir=/tmp
DRIVER_JAVA_OPTIONS="-Dlog4j.configuration=file://${current_dir}/log4j.properties "
spark-submit \
--conf "spark.driver.extraClassPath=$current_dir/lib/*" \
--conf "spark.driver.extraJavaOptions=-Djava.security.krb5.conf=${current_dir}/config/krb5.conf -Djava.security.auth.login.config=${current_dir}/config/mssqldriver.conf" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file://${curent_dir}/log4j.properties " \
--class "my.AppMain" \
--files ${current_dir}/log4j.properties \
--master local[1] \
--driver-java-options "$DRIVER_JAVA_OPTIONS" \
--num-executors 4 \
--driver-memory 16g \
--executor-cores 10 \
--executor-memory 6g \
$current_dir/my-app-SNAPSHOT-assembly.jar
log4j properties:
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
log4j.additivity.org=false
log4j.logger.org=WARN
parquet.hadoop=WARN
log4j.logger.com.barcap.eddi=WARN
log4j.logger.com.barcap.mercury=WARN
log4j.logger.yarn=WARN
log4j.logger.io.netty=WARN
log4j.logger.Remoting=WARN
log4j.logger.org.apache.hadoop=ERROR
# this disables the table creation logging which is so verbose
log4j.logger.hive.ql.parse.ParseDriver=WARN
# this disables pagination nonsense when running in combined mode
log4j.logger.com.barcap.risk.webservice.servlet.PaginationFactory=WARN
example-spark/src/main/resources/log4j. properties.
Pay attention the Spark worker is not your Java application, so you can't use a log4j.properties file from the class-path.
To understand how Spark on YARN will read a log4j.properties
file, you can use the log4j.debug=true
flag:
spark.executor.extraJavaOptions=-Dlog4j.debug=true
Most of the time, the error is that the file is not found/available from the worker YARN container. There is a very useful Spark directive that allows to share file: --files
.
--files "./log4j.properties"
This will make this file available from all your driver/workers. Add Java extra options:
-Dlog4j.configuration=log4j.properties
Et voilà!
log4j: Using URL [file:/var/log/ambari-server/hadoop/yarn/local/usercache/hdfs/appcache/application_1524817715596_3370/container_e52_1524817715596_3370_01_000002/log4j.properties] for automatic log4j configuration.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With