Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Import JSON into ClickHouse

I create table with this statement:

CREATE TABLE event(
    date Date,
    src UInt8,
    channel UInt8,
    deviceTypeId UInt8,
    projectId UInt64,
    shows UInt32,
    clicks UInt32,
    spent Float64
) ENGINE = MergeTree(date, (date, src, channel, projectId), 8192);

Raw data looks like:

{ "date":"2016-03-07T10:00:00+0300","src":2,"channel":18,"deviceTypeId ":101, "projectId":2363610,"shows":1232,"clicks":7,"spent":34.72,"location":"Unknown", ...}
...

Files with data loaded with the following command:

cat *.data|sed 's/T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]+0300//'| clickhouse-client --query="INSERT INTO event FORMAT JSONEachRow"

clickhouse-client throw exception:

Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: location: (at row 1)

Is it possible to skip fields from JSON object that not presented in table description?

like image 369
Vers_us Avatar asked Sep 08 '16 11:09

Vers_us


2 Answers

The latest ClickHouse release (v1.1.54023) supports input_format_skip_unknown_fields user option which eneables skipping of unknown fields for JSONEachRow and TSKV formats.

Try

clickhouse-client -n --query="SET input_format_skip_unknown_fields=1; INSERT INTO event FORMAT JSONEachRow;"

See more details in documentation.

like image 197
Vitaliy L. Avatar answered Oct 10 '22 13:10

Vitaliy L.


Currently, it is not possible to skip unknown fields.

You may create temporary table with additional field, INSERT data into it, and then do INSERT SELECT into final table. Temporary table may have Log engine and INSERT into that "staging" table will work faster than into final MergeTree table.

It is relatively easy to add possibility to skip unknown fields into code (something like setting 'format_skip_unknown_fields').

like image 36
uYSIZfoz Avatar answered Oct 10 '22 13:10

uYSIZfoz