
Logstash filter remove_field for all fields except a specified list of fields

I am parsing a set of data into an ELK stack for some non-tech folks to view. As part of this, I want to remove all fields except a specific, known subset from the events before sending them to Elasticsearch.

I can explicitly specify each field to drop in a mutate filter like so:

filter {
    mutate {
        remove_field => [ "throw_away_field1", "throw_away_field2" ]
    }
}

In this case, any time a new field gets added to the input data (which can happen often, since the data is pulled from a queue and used by multiple systems for multiple purposes), the filter would need to be updated, which is unnecessary overhead. Worse, sensitive data could slip through in the window between the input streams changing and the filter being updated.

Is there a way using the logstash filter to iterate over each field of an object, and remove_field if it is not in a provided list of field names? Or would I have to write a custom filter to do this? Basically, for every single object, I just want to keep 8 specific fields, and toss absolutely everything else.

It looks like only very minimal logic of the if ![field] =~ /^value$/ type is available in the logstash.conf file, and I don't see any examples that iterate over the fields themselves in a for-each style and compare each field name to a list of values.

Answer:

After upgrading logstash to 1.5.0 to be able to use plugin extensions such as prune, the solution ended up looking like this:

filter {
    prune {
        interpolate => true
        whitelist_names => ["fieldtokeep1","fieldtokeep2"]
    }
}

asked Oct 28 '15 18:10 by redstonemercury


2 Answers

Prune whitelist should be what you're looking for.
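
As a rough sketch of what that could look like: the entries in whitelist_names are treated as regular expressions and prune works on top-level fields only, so anchoring each name with ^ and $ avoids keeping look-alike fields by accident. The field names below are the placeholders from the question, and listing @timestamp is an assumption that the event time should be kept (it likely gets pruned like any other field if it is not listed).

```
filter {
    prune {
        # Each entry is a regular expression matched against top-level field names.
        # Without the anchors, "fieldtokeep1" would also match "fieldtokeep10".
        whitelist_names => [ "^fieldtokeep1$", "^fieldtokeep2$", "^@timestamp$" ]
    }
}
```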

For more specific control, dropping to the ruby filter is probably the next step.
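
A minimal sketch of the ruby filter approach, assuming an exact-name whitelist is wanted instead of regular expressions. The names in @keep are hypothetical placeholders, and keeping @timestamp and @version is an assumption rather than a requirement. It only looks at top-level fields.

```
filter {
    ruby {
        # Hypothetical list of fields to keep; built once when the filter starts.
        init => "@keep = ['fieldtokeep1', 'fieldtokeep2', '@timestamp', '@version']"
        # Walk a snapshot of the top-level field names and drop anything not listed.
        code => "
            event.to_hash.keys.each do |name|
                event.remove(name) unless @keep.include?(name)
            end
        "
    }
}
```

Because the comparison is a plain include? check, a field survives only if its name matches an entry exactly, which may be easier to reason about than regular expressions when sensitive data is a concern.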

answered Sep 25 '22 14:09 by Alain Collins


Another option would be to move the parsed JSON into a new field and then use mutate, e.g.:

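filter {
   # Parse the raw JSON string in "json" into a temporary "parsed_json" field
   json {
      source => "json"
      target => "parsed_json"
   }

   # Copy out the one value to keep, then drop both the raw and the parsed copies
   mutate {
      add_field => {"nested_field" => "%{[parsed_json][nested_field]}"}
      remove_field => [ "json", "parsed_json" ]
   }
}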

answered Sep 23 '22 14:09 by user6751687