
fluentd nested json parsing

Tags: json, fluentd

I have logs like the following:

{
  "log": {
    "header": {
      "key": "value",
      "nested": "{\"key1\":\"value\",\"key2\":\"value\"}",
      "dateTime": "2019-05-08T20:58:06+00:00"
    },
    "body": {
      "path": "/request/path/",
      "method": "POST",
      "ua": "curl/7.54.0",
      "resp": 200
    }
  }
}

I'm trying to aggregate logs using fluentd and I want the entire record to be JSON. The specific problem is the "$.log.header.nested" field, which is a JSON string. How can I parse and replace that string with its contents?

For clarity, I'd like the logs output by fluentd to look like this:

{
  "log": {
    "header": {
      "key": "value",
      "nested": {
          "key1": "value",
          "key2": "value"
      },
      "dateTime": "2019-05-08T20:58:06+00:00"
    },
    "body": {
      "path": "/request/path/",
      "method": "POST",
      "ua": "curl/7.54.0",
      "resp": 200
    }
  }
}

I've found a way to parse the nested field as JSON, but it isn't clear how to store the result back to the same key it was parsed from. It doesn't seem like hash_value_field supports storing to a nested key. Is there some other way to accomplish this?

shadfc asked May 08 '19 21:05




1 Answer

The following config seems to accomplish what I want, but I'm not sure it's the best way. I assume that using Ruby (enable_ruby) is considerably less performant than a config-only solution. Any improvements are welcome.

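# Step 1: parse the JSON string at $.log.header.nested into a temporary
# top-level key (parsed_nested), keeping the rest of the record.
<filter logs>
  @type parser
  key_name "$.log.header.nested"
  hash_value_field "parsed_nested"
  reserve_data true
  remove_key_name_field true
  <parse>
    @type json
  </parse>
</filter>

# Step 2: the Ruby assignment writes the parsed hash back to
# log.header.nested as a side effect; the temporary key is then removed.
<filter logs>
  @type record_transformer
  enable_ruby true
  <record>
    parsed_nested ${record["log"]["header"]["nested"] = record["parsed_nested"]}
  </record>
  remove_keys parsed_nested
</filter>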
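
To make the intended end result easier to check outside of fluentd, here is a minimal plain-Ruby sketch of the same transformation: parse the JSON string at log.header.nested and store the resulting hash back under the same key. The helper name expand_nested! is hypothetical and not part of fluentd. The sketch only mirrors the outcome of the two filters above, not how fluentd runs them internally.

```ruby
require "json"

# Hypothetical helper (not a fluentd API): parse the JSON string stored at
# record["log"]["header"]["nested"] and put the parsed value back under the
# same key. The record is modified in place and returned.
def expand_nested!(record)
  header = record.dig("log", "header")
  return record unless header.is_a?(Hash) && header["nested"].is_a?(String)

  begin
    header["nested"] = JSON.parse(header["nested"])
  rescue JSON::ParserError
    # Leave the original string untouched if it is not valid JSON.
  end
  record
end

# Sample record copied from the question.
record = {
  "log" => {
    "header" => {
      "key" => "value",
      "nested" => "{\"key1\":\"value\",\"key2\":\"value\"}",
      "dateTime" => "2019-05-08T20:58:06+00:00"
    },
    "body" => {
      "path" => "/request/path/",
      "method" => "POST",
      "ua" => "curl/7.54.0",
      "resp" => 200
    }
  }
}

expand_nested!(record)
puts JSON.pretty_generate(record)
```

Unlike the filter version, this replaces the value in place, so "nested" keeps its original position among the header keys. A string that is not valid JSON is left unchanged rather than raising an error.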
shadfc answered Nov 20 '22 08:11