
Importing CSV file into Hadoop




I am new to Hadoop. I have a CSV file that I need to import into Hadoop from the command line (I access the machine through SSH).

How can I import the file into Hadoop, and which command can I use afterward to check that it worked?

akaliza asked Dec 14 '15 21:12


2 Answers

Two steps to import a CSV file:

  1. Copy the CSV file to the Hadoop sandbox (/home/username) using WinSCP or Cyberduck.
  2. Use the -put command to copy the file from the local file system to HDFS:

        hdfs dfs -put /home/username/file.csv /user/data/file.csv
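To check afterward that the file really landed in HDFS, something like the following could work. This is only a sketch: `import_csv` is an illustrative helper name (not a Hadoop command), and the paths are the example paths from this answer. It assumes the `hdfs` client is on your PATH and that you have write permission on the target directory.

```shell
#!/usr/bin/env bash
# Sketch: upload a local file to HDFS and confirm that it arrived.
# import_csv and its arguments are illustrative names, not part of Hadoop.
import_csv() {
    local src="$1"   # local path, e.g. /home/username/file.csv
    local dest="$2"  # HDFS path, e.g. /user/data/file.csv

    # Create the parent directory in HDFS if it is missing
    # (-p: also creates parents and does not fail if it already exists).
    hdfs dfs -mkdir -p "$(dirname "$dest")" || return 1

    # Copy the local file into HDFS.
    hdfs dfs -put "$src" "$dest" || return 1

    # -test -e exits with status 0 only if the path exists in HDFS.
    if hdfs dfs -test -e "$dest"; then
        echo "OK: $dest exists in HDFS"
        hdfs dfs -ls "$dest"    # show permissions, owner, size and timestamp
    else
        echo "FAILED: $dest not found in HDFS" >&2
        return 1
    fi
}

# Usage (requires a working Hadoop client):
#   import_csv /home/username/file.csv /user/data/file.csv
#   hdfs dfs -cat /user/data/file.csv | head -n 5   # peek at the first rows
```

For a quick manual check, `hdfs dfs -ls /user/data` on its own is usually enough; `-cat` piped through `head` lets you eyeball the first few rows of the CSV.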
Sam answered Sep 27 '22 20:09


There are three flags we can use to load data from the local machine into HDFS.


We use the -copyFromLocal flag to copy data from the local file system to an HDFS directory.

hdfs dfs -copyFromLocal /home/username/file.csv /user/data/file.csv

If the target folder does not exist yet, we can create it first as the HDFS or root user:

hdfs dfs -mkdir /user/data


As @Sam mentioned in the answer above, we can also use the -put flag to copy data from the local file system to an HDFS directory.

hdfs dfs -put /home/username/file.csv /user/data/file.csv


We can also use the -moveFromLocal flag to copy data from the local file system to an HDFS directory, but this removes the file from the local directory once the copy completes.

hdfs dfs -moveFromLocal /home/username/file.csv /user/data/file.csv
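If you want to see the difference for yourself, a small wrapper like this one could confirm that the local copy is gone after the move. It is a sketch only: `move_to_hdfs` is a made-up helper name, and it assumes the `hdfs` client is on your PATH.

```shell
#!/usr/bin/env bash
# Sketch: move a local file into HDFS, then report whether the local copy is gone.
# move_to_hdfs is an illustrative helper name, not a Hadoop command.
move_to_hdfs() {
    local src="$1"   # local path
    local dest="$2"  # HDFS path

    # Unlike -put and -copyFromLocal, -moveFromLocal deletes the source after copying.
    hdfs dfs -moveFromLocal "$src" "$dest" || return 1

    if [ -e "$src" ]; then
        echo "local copy still present: $src"
    else
        echo "local copy removed: $src"
    fi
}

# Usage (requires a working Hadoop client):
#   move_to_hdfs /home/username/file.csv /user/data/file.csv
```

Note that -put and -copyFromLocal refuse to overwrite a file that already exists in HDFS; both accept -f if you do want to replace it.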
INDRAJITH answered Sep 27 '22 21:09