I've seen a number of similar questions on Stack Overflow, including this one, but none addresses my particular issue.
The application is deployed in a Kubernetes (v1.15) cluster. I'm using a Docker image based on the fluent/fluentd-docker-image GitHub repo, v1.9/armhf, modified to include the elasticsearch plugin. Elasticsearch and Kibana are both version 7.6.0.
The logs are going to stdout and look like:
{"Application":"customer","HTTPMethod":"GET","HostName":"","RemoteAddr":"10.244.4.154:51776","URLPath":"/customers","level":"info","msg":"HTTP request received","time":"2020-03-10T20:17:32Z"}
In Kibana I'm seeing something like this:
{
  "_index": "logstash-2020.03.10",
  "_type": "_doc",
  "_id": "p-UZxnABBcooPsDQMBy_",
  "_version": 1,
  "_score": null,
  "_source": {
    "log": "{\"Application\":\"customer\",\"HTTPMethod\":\"GET\",\"HostName\":\"\",\"RemoteAddr\":\"10.244.4.154:46160\",\"URLPath\":\"/customers\",\"level\":\"info\",\"msg\":\"HTTP request received\",\"time\":\"2020-03-10T20:18:18Z\"}\n",
    "stream": "stdout",
    "docker": {
      "container_id": "cd1634b0ce410f3c89fe63f508fe6208396be87adf1f27fa9d47a01d81ff7904"
    },
    "kubernetes": {
      ...
I'm expecting to see the JSON pulled out of the log value, something like this (abbreviated):
{
  "_index": "logstash-2020.03.10",
  ...
  "_source": {
    "log": "...",
    "Application": "customer",
    "HTTPMethod": "GET",
    "HostName": "",
    "RemoteAddr": "10.244.4.154:46160",
    "URLPath": "/customers",
    "level": "info",
    "msg": "HTTP request received",
    "time": "2020-03-10T20:18:18Z",
    "stream": "stdout",
    "docker": {
      "container_id": "cd1634b0ce410f3c89fe63f508fe6208396be87adf1f27fa9d47a01d81ff7904"
    },
    "kubernetes": {
      ...
My fluentd config is:
<match fluent.**>
  @type null
</match>

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag kubernetes.*
  format json
  read_from_head true
</source>

<match kubernetes.var.log.containers.**fluentd**.log>
  @type null
</match>

<match kubernetes.var.log.containers.**kube-system**.log>
  @type null
</match>

<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

<match **>
  @type elasticsearch
  @id out_es
  @log_level info
  include_tag_key true
  host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
  port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
  path "#{ENV['FLUENT_ELASTICSEARCH_PATH']}"
  <format>
    @type json
  </format>
</match>
I'm sure I'm missing something. Can anyone point me in the right direction?
Thanks, Rich
This config worked for me:
<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /opt/bitnami/fluentd/logs/buffers/fluentd-docker.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
    time_key time
    time_format %iso8601
  </parse>
</source>
<filter kubernetes.**>
  @type parser
  key_name "$.log"
  hash_value_field "log"
  reserve_data true
  <parse>
    @type json
  </parse>
</filter>

<filter kubernetes.**>
  @type kubernetes_metadata
</filter>
Make sure to edit the path parameter so that it matches your use case.
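For reference, each line the tail source reads under /var/log/containers is a Docker json-file record: the container's stdout line is embedded as a string under the log key, roughly like this (timestamps illustrative):

{"log":"{\"Application\":\"customer\",\"level\":\"info\",\"msg\":\"HTTP request received\",\"time\":\"2020-03-10T20:18:18Z\"}\n","stream":"stdout","time":"2020-03-10T20:18:18.000000000Z"}

This is why the parser filter above targets key_name "$.log".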
This happens because the Docker log files in /var/log/containers/*.log carry each container's STDOUT line as a string under the 'log' key, so your JSON log lines arrive serialized inside that string. What you need is an additional step that parses the string under the 'log' key (see the before/after example following the snippet):
<filter kubernetes.**>
  @type parser
  key_name "$.log"
  hash_value_field "log"
  reserve_data true
  <parse>
    @type json
  </parse>
</filter>
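Illustratively (abbreviated), this filter turns a record like

{"log":"{\"level\":\"info\",\"msg\":\"HTTP request received\"}\n","stream":"stdout"}

into

{"log":{"level":"info","msg":"HTTP request received"},"stream":"stdout"}

reserve_data true keeps the original top-level fields (stream, docker, kubernetes, ...), and hash_value_field "log" stores the parsed object back under the log key. If you'd rather have the parsed fields merged into the top level, as in the expected output above, omitting hash_value_field should do that (per the filter_parser documentation).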