I was asked to how to capture logging output from a Spring Boot application that runs as a Java action from Oozie.
My initial thought was that it would be possible edit some log4j properties to capture the application logs inside YARN or Oozie. Then it occurred to me that that Kafka would be a much easier way to capture and aggregate log messages, for a specific application, running on various cluster nodes. It's much easier to monitor a distributed system by subscribing to a topic than fishing through log files.
I notice that Kafka has a log4j appender and so I tried to create a minimal reproducible example (posted on github: https://github.com/alexwoolford/spring-boot-log-to-kafka-example). Here's a snippet from pom.xml:
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>1.4.4.RELEASE</version>
</parent>
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter</artifactId>
        <exclusions>
            <exclusion>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-logging</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.springframework.boot</groupId>
                <artifactId>logback-classic</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-log4j</artifactId>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-log4j-appender</artifactId>
        <version>0.10.0.0</version>
    </dependency>
    <dependency>
        <groupId>net.logstash.log4j</groupId>
        <artifactId>jsonevent-layout</artifactId>
        <version>1.7</version>
    </dependency>
    <dependency>
        <groupId>commons-logging</groupId>
        <artifactId>commons-logging</artifactId>
        <version>1.2</version>
    </dependency>
</dependencies>
My log4j.properties file looks like this:
log4j.rootLogger=INFO
log4j.appender.KAFKA=org.apache.kafka.log4jappender.KafkaLog4jAppender
log4j.appender.KAFKA.layout=net.logstash.log4j.JSONEventLayoutV1
log4j.appender.KAFKA.topic=logs
log4j.appender.KAFKA.brokerList=hdp-single-node:6667
log4j.appender.KAFKA.syncSend=true
log4j.appender.KAFKA.producer.type=async
log4j.logger.io.woolford=INFO, KAFKA
This works, except that it generates a warning:
log4j:WARN No appenders could be found for logger (org.apache.kafka.clients.producer.ProducerConfig).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Even though this application runs and does what I need it to, the warnings suggest that I've misconfigured something. Can you see what needs to be changed?
Also, I notice that Spring Boot, by default, uses Logback and I notice that there's an open source project, logback-kafka-appender, that allows Logback to append to Kafka. Is the Kafka log4j appender the best way for Spring Boot to log to Kafka?
Log4j2 has a Kafka appender. It was necessary to add the spring-boot-starter-log4j2 and jackson-databind artifacts to the pom.xml:
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter</artifactId>
        <exclusions>
            <exclusion>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-logging</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.springframework.boot</groupId>
                <artifactId>logback-classic</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-log4j2</artifactId>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-log4j-appender</artifactId>
        <version>0.10.0.0</version>
        <exclusions>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.8.6</version>
    </dependency>
</dependencies>
I then created an XML formatted log4j2.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="info" name="spring-boot-log-to-kafka-example" packages="io.woolford">
    <Appenders>
        <Kafka name="kafkaAppender" topic="logs">
            <JSONLayout />
            <Property name="bootstrap.servers">hdp-single-node:6667</Property>
        </Kafka>
    </Appenders>
    <Loggers>
        <Root level="INFO">
            <AppenderRef ref="kafkaAppender"/>
        </Root>
        <Logger name="org.apache.kafka" level="WARN" />
    </Loggers>
</Configuration>
The logging messages are sent to Kafka in JSON format, e.g.
{
    "timeMillis": 1485736022854,
    "thread": "Thread-1",
    "level": "INFO",
    "loggerName": "org.springframework.context.annotation.AnnotationConfigApplicationContext",
    "message": "Closing org.springframework.context.annotation.AnnotationConfigApplicationContext@20140db9: startup date [Sun Jan 29 17:26:52 MST 2017]; root of context hierarchy",
    "endOfBatch": false,
    "loggerFqcn": "org.apache.commons.logging.impl.SLF4JLocationAwareLog",
    "threadId": 19,
    "threadPriority": 5
}
For Spring 2.4.4 and reactive kafka, adding the following to application.properties, reduced a lot of console messages
logging.level.org.apache.kafka=OFF
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With