I'm trying to query a database with the Logstash jdbc input plugin and write the results to a CSV file with headers using the Logstash csv output plugin.
I have spent a lot of time in the Logstash documentation, but I'm still missing something.
With the following Logstash configuration, the output file has a header line before every row. I couldn't find a way to write the headers only once, on the first line, from the Logstash configuration.
Any help is much appreciated.
_object$id;_object$name;_object$type;nb_surveys;csat_score
2;Jeff Karas;Agent;2;2
_object$id;_object$name;_object$type;nb_surveys;csat_score
3;John Lafer;Agent;2;2
_object$id;_object$name;_object$type;nb_surveys;csat_score
4;Michele Fisher;Agent;2;2
_object$id;_object$name;_object$type;nb_surveys;csat_score
5;Chad Hendren;Agent;2;78
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/postgres"
    jdbc_user => "postgres"
    jdbc_password => "postgres"
    jdbc_driver_library => "/tmp/drivers/postgresql/postgresql_jdbc.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    statement_filepath => "query.sql"
  }
}
output {
  csv {
    fields => ["_object$id","_object$name","_object$type","nb_surveys","csat_score"]
    path => "output/%{team}/output-%{team}.%{+yyyy.MM.dd}.csv"
    csv_options => {
      "write_headers" => true
      "headers" => ["_object$id","_object$name","_object$type","nb_surveys","csat_score"]
      "col_sep" => ";"
    }
  }
}
Thanks
You are getting multiple headers in the output because Logstash has no concept of global or shared state between events. Each event is handled in isolation, so every time the CSV output plugin runs, it behaves as if it were writing the first row and writes the headers again.
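The effect can be reproduced outside Logstash with plain Ruby. This is only a sketch: it assumes the csv output serializes every event with its own, independent CSV call using the configured csv_options, which matches the observed output but is not taken from the plugin source. The helper name serialize_rows and the sample rows are made up for illustration.

```ruby
require 'csv'

# Assumed options, modelled on the csv_options in the question.
OPTIONS = { col_sep: ';', write_headers: true, headers: %w[id name] }

# Hypothetical helper: every row goes through its own CSV.generate_line call.
# Each call starts from scratch, so each one emits the header line again.
def serialize_rows(rows, options)
  rows.map { |row| CSV.generate_line(row, **options) }.join
end

puts serialize_rows([%w[2 Jeff], %w[3 John]], OPTIONS)
```

With two rows, the header line shows up twice, once in front of each row, which is the same pattern as in the question's output file.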
I had the same issue and found a solution: use the init option of the ruby filter to execute some code once, at Logstash startup.
Here is an example logstash config:
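# csv-headers.conf
input {
  stdin {}
}
filter {
  ruby {
    # init runs once at startup: create the file with a header row if it is missing or empty
    init => "
      require 'csv'
      begin
        @@csv_file = 'output.csv'
        @@csv_headers = ['A','B','C']
        if File.zero?(@@csv_file) || !File.exist?(@@csv_file)
          CSV.open(@@csv_file, 'w') do |csv|
            csv << @@csv_headers
          end
        end
      end
    "
    # code runs for every event; on Logstash 5 and later, use
    # event.set('[@metadata][csv_file]', @@csv_file) instead of event[...] = ...
    code => "
      begin
        event['@metadata']['csv_file'] = @@csv_file
        event['@metadata']['csv_headers'] = @@csv_headers
      end
    "
  }
  # parse each stdin line into the fields a, b and c
  csv {
    columns => ["a", "b", "c"]
  }
}
output {
  # rows are appended to the file that init already gave a header line
  csv {
    fields => ["a", "b", "c"]
    path => "%{[@metadata][csv_file]}"
  }
  stdout {
    codec => rubydebug {
      metadata => true
    }
  }
}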
If you run Logstash with that config:
printf '1,2,3\n4,5,6\n7,8,9\n' | ./bin/logstash -f csv-headers.conf
You will get an output.csv file with this content:
A,B,C
1,2,3
4,5,6
7,8,9
This is also thread-safe because it runs the code on startup only, so you can use multiple workers.
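The header check from the init block can also be tried on its own as plain Ruby. This is a minimal sketch, not part of Logstash: the helper name ensure_csv_headers is hypothetical, and it only mirrors the "write headers when the file is missing or empty" logic shown above.

```ruby
require 'csv'

# Hypothetical helper: write the header row only when the file is missing
# or empty. Returns true if headers were written, false if the file already
# had content and was left untouched.
def ensure_csv_headers(path, headers)
  return false if File.exist?(path) && !File.zero?(path)

  CSV.open(path, 'w') { |csv| csv << headers }
  true
end

# First call creates output.csv with the header line; later calls are no-ops.
ensure_csv_headers('output.csv', %w[A B C])
```

Because the check looks at the file rather than at any in-memory flag, restarting the process does not add a second header line to a file that already has rows in it.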
Hope it helps!