Delete data from .Trash in hdfs

Tags:

hive

hdfs

When I delete data from HDFS, it goes to .Trash.

I can either use

hdfs dfs -rm -r -skipTrash /path_to_directory

or use

hdfs dfs -expunge

to remove data already in .Trash

Two questions:

  1. When a table is dropped in Hive, does its data go to .Trash, or is it removed permanently?

  2. Is it possible to remove data from .Trash for all users, assuming I have the required privileges?

Pawel

psmith asked Sep 24 '18 10:09

People also ask

Can we delete data from HDFS?

The Hadoop fs shell provides an rm command. It is similar to the Linux rm command and removes a file from the HDFS file system. To delete files recursively, use -rm -r; the older -rmr form is deprecated.

How do I delete a file in Hadoop folder?

Log in to the Hadoop NameNode using the database administrator's account and use HDFS's rm -r command (the older rmr is deprecated) to delete the directories.

What is skipTrash in Hadoop?

If the -skipTrash option is specified, the trash, if enabled, will be bypassed and the specified file(s) deleted immediately. This can be useful when it is necessary to delete files from an over-quota directory. Example: hdfs dfs -rm -r -skipTrash /user/hadoop/dir.


1 Answer

If you are dropping an external table, no data is deleted by default: only the table metadata is removed, so nothing goes to .Trash.

For managed tables, yes: all the data files go to the .Trash directory of the user who ran the DROP command, provided HDFS trash is enabled and the table was not dropped with the PURGE option. Only the data goes to .Trash, not the metadata.
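To check where a dropped managed table's files ended up, it helps to know that HDFS trash mirrors the original absolute path under the user's .Trash/Current directory. The following is a minimal sketch, assuming the default home directory layout /user/<name>. The helper trash_location and the table path in the example are hypothetical names used for illustration, not Hadoop commands.

```shell
# Hypothetical helper: print where HDFS trash keeps a deleted path.
# Assumes the default layout /user/<name>/.Trash/Current/<original absolute path>.
trash_location() {
    user="$1"        # user who ran the delete or DROP TABLE
    original="$2"    # absolute HDFS path before deletion
    printf '/user/%s/.Trash/Current%s\n' "$user" "$original"
}

# Example (the table path is an assumption; adjust to your warehouse directory):
# hdfs dfs -ls "$(trash_location "$USER" /user/hive/warehouse/mydb.db/mytable)"
```

Once a trash checkpoint has been created, the files move from Current to a timestamped directory under .Trash, so it may be necessary to list .Trash itself to find them.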

As for the other part of your question: yes, you can delete the .Trash directory of any user if you have the required privileges.
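One way this could be done, as a rough sketch: list the home directories under /user and remove each one's .Trash with -skipTrash, so the deleted trash is not moved into the trash of the user running the command. This assumes home directories live under /user (the default), that the command is run by a user with superuser rights on HDFS, and that the installed Hadoop version supports the -C option of ls (paths only). The names trash_dirs, purge_all_trash and DRY_RUN are hypothetical and introduced here only for illustration.

```shell
# Map home directories (one per line on stdin) to their .Trash paths.
trash_dirs() {
    while IFS= read -r home; do
        if [ -n "$home" ]; then
            printf '%s/.Trash\n' "$home"
        fi
    done
}

# Remove every user's .Trash under /user.
# With DRY_RUN=1 (the default in this sketch) the paths are only printed.
purge_all_trash() {
    # -C prints bare paths; this assumes a Hadoop version that supports it.
    hdfs dfs -ls -C /user | trash_dirs | while IFS= read -r dir; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would remove: $dir"
        else
            # -f suppresses errors for users who have no .Trash directory.
            hdfs dfs -rm -r -f -skipTrash "$dir"
        fi
    done
}
```

Running purge_all_trash with the default DRY_RUN=1 prints the directories that would be removed; DRY_RUN=0 purge_all_trash performs the deletion.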

Gaurang Shah answered Oct 28 '22 03:10