
Importing CSV file into Hadoop

Tags: csv, hadoop2

I am new to Hadoop, and I have a CSV file to import into it from the command line (I access the machine through SSH).

How can I import the file into Hadoop? Which command can I use afterward to check that it worked?

akaliza asked Dec 14 '15 21:12


2 Answers

Two steps to import a CSV file:

  1. Move the CSV file to the Hadoop sandbox (/home/username) using WinSCP or Cyberduck.
  2. Use the -put command to copy the file from its local location to HDFS.

        hdfs dfs -put /home/username/file.csv /user/data/file.csv
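
To check the result afterward, the file can be looked up and previewed in HDFS. The following is a minimal sketch, not part of the original answer: the function name `verify_upload` is hypothetical, and it assumes the `hdfs` client is on the PATH of the SSH session.

```shell
# Hypothetical helper: check that a path exists in HDFS and preview it.
# $1 = HDFS path, e.g. /user/data/file.csv
verify_upload() {
  # -test -e exits with status 0 only if the path exists
  if ! hdfs dfs -test -e "$1"; then
    echo "not found in HDFS: $1" >&2
    return 1
  fi
  # List the file with its size, owner and modification time
  hdfs dfs -ls "$1"
  # Print the first five lines of the CSV
  hdfs dfs -cat "$1" | head -n 5
}
```

After sourcing this in the shell, `verify_upload /user/data/file.csv` should list the file and show its first rows, or report that the path is missing.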
    
Sam answered Sep 27 '22 20:09


There are three flags that we can use to load data from the local machine into HDFS:

-copyFromLocal

We use this flag to copy data from the local file system to the Hadoop directory.

hdfs dfs -copyFromLocal /home/username/file.csv /user/data/file.csv

If the target folder does not exist yet, we can create it first, as the HDFS or root user:

hdfs dfs -mkdir /user/data
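
The directory creation and the copy can also be combined so the upload does not fail on a missing folder. This is only a sketch under assumptions: the name `copy_to_hdfs` is made up for illustration, and the current user is assumed to have write permission on the target path.

```shell
# Hypothetical helper: create the HDFS directory if needed, then copy a file.
# $1 = local file, $2 = HDFS target directory
copy_to_hdfs() {
  # -p creates missing parent directories and succeeds if the directory exists
  hdfs dfs -mkdir -p "$2" &&
    hdfs dfs -copyFromLocal "$1" "$2/"
}
```

For example, `copy_to_hdfs /home/username/file.csv /user/data` would place the file at /user/data/file.csv. If a file with that name already exists there, the copy is expected to fail unless `-f` is added to overwrite it.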

-put

As @Sam mentioned in the answer above, we can also use the -put flag to copy data from the local file system to the Hadoop directory.

hdfs dfs -put /home/username/file.csv /user/data/file.csv

-moveFromLocal

We can also use the -moveFromLocal flag to copy data from the local file system to the Hadoop directory, but it removes the file from the local directory once the copy succeeds.

hdfs dfs -moveFromLocal /home/username/file.csv /user/data/file.csv
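
Because -moveFromLocal deletes the source, it can be worth confirming that the local copy is really gone before relying on it. A rough sketch, with the hypothetical function name `move_to_hdfs` and the same example paths as above:

```shell
# Hypothetical helper: move a file into HDFS and confirm the local copy is gone.
# $1 = local file, $2 = HDFS destination path
move_to_hdfs() {
  # Stop here if the move itself reports a failure
  hdfs dfs -moveFromLocal "$1" "$2" || return 1
  # After a successful move the source should no longer exist locally
  if [ -e "$1" ]; then
    echo "local file still present: $1" >&2
    return 1
  fi
  echo "moved $1 to $2"
}
```

Calling `move_to_hdfs /home/username/file.csv /user/data/file.csv` prints a confirmation only when the move succeeded and the local file no longer exists.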
INDRAJITH answered Sep 27 '22 21:09