I am trying to load a dataset stored on HDFS (a text file) into Hive for analysis. I am using CREATE EXTERNAL TABLE as follows:
CREATE EXTERNAL table myTable(field1 STRING...)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/myusername/datasetlocation';
This works fine, but it requires write access to the HDFS location. Why is that?
In general, what is the right way to load text data to which I do not have write access? Is there a 'read-only' external table type?
Edit: I noticed this issue filed against Hive regarding the question. It does not seem to have been resolved.
Partially answering my own question:
Indeed, it seems this has not been resolved in Hive at this moment. But here is an interesting fact: Hive does not require write access to the files themselves, only to the folder. For example, you could have a folder with permissions 777, while the files within it, which are accessed by Hive, stay read-only, e.g. 644.
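To illustrate, you can verify that layout from the shell (a sketch using the location from the question; the exact listing output will depend on your files):
hdfs dfs -ls -d /user/myusername/datasetlocation    # the folder itself, expect drwxrwxrwx for 777
hdfs dfs -ls /user/myusername/datasetlocation       # the data files, expect -rw-r--r-- for 644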
I don't have a solution to this, but as a workaround I've discovered that
CREATE TEMPORARY EXTERNAL TABLE
works without write permissions, the difference being that the table (but not the underlying data) will disappear at the end of your session.
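For reference, a sketch reusing the (abbreviated) column list and location from the question; only the TEMPORARY keyword changes:
CREATE TEMPORARY EXTERNAL TABLE myTable(field1 STRING...)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/myusername/datasetlocation';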
If you require write access to HDFS files, run
hadoop dfs -chmod 777 /folder_name
This grants all access permissions (read, write, and execute for everyone) on that particular path.
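For example, against the location from the question (a sketch; note that hadoop dfs is deprecated in recent Hadoop releases in favor of hdfs dfs):
hdfs dfs -chmod 777 /user/myusername/datasetlocation       # open up the folder only
hdfs dfs -chmod -R 777 /user/myusername/datasetlocation    # or recursively, if the files should be writable too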