
import csv into elasticsearch


I'm doing the "Elasticsearch getting started" tutorial. Unfortunately, this tutorial doesn't cover the first step, which is importing a CSV database into Elasticsearch.

I googled for a solution, but unfortunately it doesn't work. Here is what I want to achieve and what I have:

I have a file with the data I want to import (simplified):

id,title
10,Homer's Night Out
12,Krusty Gets Busted

I would like to import it using Logstash. After researching on the internet, I ended up with the following config:

input {
    file {
        path => ["simpsons_episodes.csv"]
        start_position => "beginning"
    }
}

filter {
    csv {
        columns => [
            "id",
            "title"
        ]
    }
}

output {
    stdout { codec => rubydebug }
    elasticsearch {
        action => "index"
        hosts => ["127.0.0.1:9200"]
        index => "simpsons"
        document_type => "episode"
        workers => 1
    }
}

I'm having trouble specifying the document type, so that once the data is imported and I navigate to http://localhost:9200/simpsons/episode/10, I see the result for episode 10.

asked Apr 29 '17 by adelura

People also ask

How do I import a CSV file into Elasticsearch without Logstash?

The csv ingest processor will only work on a JSON document that contains a field with CSV data. You cannot throw raw CSV data at it using curl. The CSV to JSON transformation happens in Kibana (when you drop the raw CSV file in the browser window) and only then Kibana will send JSON-ified CSV. That's the way it works.
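To make that concrete, here is a small sketch of an ingest pipeline that uses the csv processor. The pipeline description, the source field name "line", and the target field names are illustrative, not from the question; the point is that the processor reads CSV text out of a field of an already-JSON document and splits it into target fields.

```python
import json

# Sketch of an ingest pipeline using the csv processor.
# The field names here ("line", "id", "title") are illustrative.
pipeline = {
    "description": "split a CSV line carried in the 'line' field",
    "processors": [
        {
            "csv": {
                # the JSON document must already carry the raw CSV text here
                "field": "line",
                # the processor splits that text into these fields
                "target_fields": ["id", "title"],
            }
        }
    ],
}

# This is the body you would PUT to /_ingest/pipeline/<pipeline-id>
body = json.dumps(pipeline, indent=2)
print(body)
```

Note that the document you send for indexing is JSON with a `line` field; the processor never sees a bare CSV file.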

Does Elasticsearch support CSV?

An Elasticsearch document in the 'crimes' index, for example, represents a single record from the CSV file. In this way you can push any CSV data into Elasticsearch and then perform search and analytics, or create dashboards, using that data.
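One way to push CSV rows without Logstash is to build a bulk-API payload yourself. The following is a hedged sketch, not part of either answer: it only builds the NDJSON body (the actual POST to /_bulk is left out), and it reuses the question's "simpsons" index name and the CSV `id` as document `_id`.

```python
import csv
import io
import json

# Sample CSV data from the question
raw = """id,title
10,Homer's Night Out
12,Krusty Gets Busted
"""

lines = []
for row in csv.DictReader(io.StringIO(raw)):
    # action line: use the CSV id as the document _id
    lines.append(json.dumps({"index": {"_index": "simpsons", "_id": row["id"]}}))
    # source line: the row itself as a JSON document
    lines.append(json.dumps(row))

# the bulk API requires a trailing newline
payload = "\n".join(lines) + "\n"
print(payload)
```

You would then POST `payload` to `http://localhost:9200/_bulk` with the `Content-Type: application/x-ndjson` header.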


2 Answers

Good job, you're almost there, you're only missing the document ID. You need to modify your elasticsearch output like this:

elasticsearch {
    action => "index"
    hosts => ["127.0.0.1:9200"]
    index => "simpsons"
    document_type => "episode"
    document_id => "%{id}"             <---- add this line
    workers => 1
}

After this you'll be able to query the episode with id 10:

GET http://localhost:9200/simpsons/episode/10 
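To illustrate what `document_id => "%{id}"` buys you, here is a small sketch (illustrative only, no request is made) of the per-document URLs that become addressable once each CSV row is indexed under its own id:

```python
import csv
import io

# Sample CSV data from the question
raw = """id,title
10,Homer's Night Out
12,Krusty Gets Busted
"""

# With document_id => "%{id}", each row lands at .../simpsons/episode/<id>
urls = [
    f"http://localhost:9200/simpsons/episode/{row['id']}"
    for row in csv.DictReader(io.StringIO(raw))
]
print(urls)
```

Without `document_id`, Elasticsearch generates a random id for each document, so episode 10 would not be reachable at `/simpsons/episode/10`.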
answered Sep 22 '22 by Val


I'm the author of moshe/elasticsearch_loader
I wrote ESL for this exact problem.
You can install it with pip:

pip install elasticsearch-loader 

Then you will be able to load CSV files into Elasticsearch by issuing:

elasticsearch_loader --index incidents --type incident csv file1.csv 

Additionally, you can use a custom id field by adding --id-field=document_id to the command line.

answered Sep 25 '22 by MosheZada