
How to load a text file into a Hive table stored as sequence files

Tags:

hadoop

hive

I have a Hive table stored as a sequencefile.

I need to load a text file into this table. How do I do that?

Asked by cldo, Dec 28 '12 at 03:12




2 Answers

You can load the text file into a textfile Hive table and then insert the data from this table into your sequencefile.

Start with a tab delimited file:

% cat /tmp/input.txt
a       b
a2      b2

Create a table stored as a sequence file:

hive> create table test_sq(k string, v string) stored as sequencefile; 

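Try to load the text file directly. As expected, this fails, because LOAD DATA only moves the file into the table's location and does not convert it to the sequence file format: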

hive> load data local inpath '/tmp/input.txt' into table test_sq; 

But with a table stored as a text file:

hive> create table test_t(k string, v string) row format delimited fields terminated by '\t' stored as textfile; 

The load works just fine:

hive> load data local inpath '/tmp/input.txt' into table test_t;
OK
hive> select * from test_t;
OK
a       b
a2      b2

Now load into the sequence table from the text table:

insert into table test_sq select * from test_t; 
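To confirm that the rows landed in the sequence file table, a quick check along these lines should work. This is only a sketch using the table names from above; the exact output depends on your Hive version, so none is shown here.

```sql
-- The rows returned should match those in test_t.
select * from test_sq;

-- The storage information should list a SequenceFile input format
-- for this table rather than a text input format.
describe formatted test_sq;
```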

You can also use overwrite with either the load or the insert to replace the table's existing contents instead of appending to them.
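As a rough illustration of the overwrite variants, assuming the same /tmp/input.txt, test_t, and test_sq as in the steps above:

```sql
-- Replace the contents of the text table with the file,
-- rather than adding the file alongside existing data.
load data local inpath '/tmp/input.txt' overwrite into table test_t;

-- Replace, rather than append to, the rows in the sequence file table.
insert overwrite table test_sq select * from test_t;
```

Without overwrite, repeating the load and insert would append the same rows a second time.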

Answered by libjack, Sep 18 '22 at 13:09


You cannot load a text file directly into a table stored as a sequence file. You must do this:

  1. Create a table stored as text
  2. Load the text file into the text table
  3. Do a CTAS (CREATE TABLE ... AS SELECT) to create the table stored as a sequence file
  4. Drop the text table if desired

Example:

CREATE TABLE test_txt(field1 int, field2 string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA INPATH '/path/to/file.tsv' INTO TABLE test_txt;

CREATE TABLE test STORED AS SEQUENCEFILE
AS SELECT * FROM test_txt;

DROP TABLE test_txt;

Answered by Michael Stratton, Sep 19 '22 at 13:09