 

Using Asynchronous Logging With Log4J2 in Spark Scala Application

The Issue: I am unable to observe the asynchronous capabilities of Log4J2 after initializing SparkContext in Spark local mode.

Log4j2 dependencies in SBT:

  "com.lmax" % "disruptor" % "3.3.5",
  "org.apache.logging.log4j" % "log4j-api" % "2.8.2",
  "org.apache.logging.log4j" % "log4j-core" % "2.8.2",
  "org.apache.logging.log4j" %% "log4j-api-scala" % "2.8.2"

Log4j2 config file:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="debug">
<Appenders>
    <Console name="Console-Appender" target="SYSTEM_OUT">
        <PatternLayout>
            <pattern>
                [%-5level] %d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %c{1} - %msg%n
            </pattern>
        </PatternLayout>
    </Console>
    <File name="File-Appender" fileName="logs/xmlfilelog.log" >
        <PatternLayout>
            <pattern>
                [%-5level] %d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %c{1} - %msg%n
            </pattern>
        </PatternLayout>
    </File>
</Appenders>
<Loggers>
    <Logger  name="guru.springframework.blog.log4j2async" level="debug">
        <AppenderRef ref="File-Appender"/>he preceding c
    </Logger>
    <Root level="debug">
        <AppenderRef ref="Console-Appender"/>
    </Root>
</Loggers>
</Configuration>
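
As an aside: if the global context selector proves awkward to propagate, Log4j2 can also make individual loggers asynchronous straight from the configuration via <AsyncLogger>/<AsyncRoot>, with no system property required. A minimal sketch of the <Loggers> section under that approach (appender names reused from above):

<Loggers>
    <!-- AsyncLogger/AsyncRoot queue events through the Disruptor
         without needing -DLog4jContextSelector -->
    <AsyncLogger name="guru.springframework.blog.log4j2async" level="debug">
        <AppenderRef ref="File-Appender"/>
    </AsyncLogger>
    <AsyncRoot level="debug">
        <AppenderRef ref="Console-Appender"/>
    </AsyncRoot>
</Loggers>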

I set the following JVM system property in IntelliJ:

-DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
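
If there is any doubt that the IDE setting reaches the JVM early enough, the same property can also be set programmatically. A minimal sketch (MyApp is a placeholder name); the key point is that it must execute before the first Log4j2 class initializes:

object MyApp {
  // Must run before any Logger is created, or the default
  // (synchronous) context selector wins
  System.setProperty("Log4jContextSelector",
    "org.apache.logging.log4j.core.async.AsyncLoggerContextSelector")

  def main(args: Array[String]): Unit = {
    // create SparkContext and loggers only after this point
  }
}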

To test asynchronous behavior, I ran the following code snippet before and after initializing SparkContext:

val iterations = 1000
val start = System.nanoTime()

// `logger` is provided by the Log4j2 Scala API, e.g. by extending
// org.apache.logging.log4j.scala.Logging
for (i <- 1 to iterations) {
  logger.error("Hello")
}

val end = System.nanoTime()
val timeMS = (end - start) / 1000000
println(s"Parsed ${iterations} reports in ${timeMS} ms ${timeMS/1000} sec")

Successful result: before initializing SparkContext, I saw the following debug line confirming that the async context was enabled:

2017-04-25 14:55:40,541 main DEBUG LoggerContext[name=AsyncContext@6d9c638, org.apache.logging.log4j.core.async.AsyncLoggerContext@758f4f03] started OK.

Additionally, my println statement ("Parsed...") appeared somewhere in the middle of the logger output, indicating asynchronous behavior.

However, once I moved this code block after the SparkContext initialization, I no longer observed async activity: the println statement ("Parsed...") always displays after the logger output. What is confusing is that the same "LoggerContext[name=AsyncContext@6d9c638, ...] started OK." debug line shown above is still observed.

For a sanity check, I also printed the package of the logger API inside the scope of the Spark application. The result was package org.apache.logging.log4j.scala, Apache Log4j Scala 2.11 wrapper for Log4j API, version 2.8.2. According to the Log4j2 docs, "Log4j 2 Scala API is dependent on the Log4j 2 API." Thus, I believe I can deduce that I am actually using Log4j2.
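
For reference, a minimal sketch of that sanity check, assuming the logger comes from the log4j-api-scala Logging trait:

import org.apache.logging.log4j.scala.Logging

object PackageCheck extends Logging {
  def main(args: Array[String]): Unit =
    // Prints the package plus the spec title/version from the jar manifest
    println(logger.getClass.getPackage)
}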

MAIN QUESTION: Is Spark somehow intercepting my Log4j2 configuration and defaulting to its own Log4j configuration, or does it simply not allow async logging behavior?

asked Nov 07 '22 by Ryan Stack

1 Answer

The only thing you need to make sure of is that the system property is set before Log4j2 is initialized:

-DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector

It seems to me that this is done correctly: if you can see that an AsyncLoggerContext is used, then you can be sure you are using Disruptor-backed Async Loggers.
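
If you want to re-check this at runtime (for instance after SparkContext is up), the active context class can be inspected directly. A minimal sketch:

import org.apache.logging.log4j.LogManager

// Prints org.apache.logging.log4j.core.async.AsyncLoggerContext
// when the async context selector took effect
println(LogManager.getContext(false).getClass.getName)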

answered Nov 15 '22 by Remko Popma