
Remove all files that do not have the following extensions in Linux

Tags:

linux

find

ssh

rm

I have a list of extensions:

avi,mkv,wmv,mp4,mp5,flv,M4V,mpeg,mov,m1v,m2v,3gp,avchd

I want to remove every file in a directory that does not have one of these extensions, as well as files that have no extension at all.

How can I do this using the rm command on Linux?

Tike asked Dec 27 '11 07:12




2 Answers

You first have to find the files that do not have those extensions. You can do this easily with the find command. You can build on the following command:

find /path/to/files ! -name "*.avi" -type f -exec rm -i {} \;

You can also use -regex instead of -name to supply a more complex search pattern. The ! negates the test that follows it, so the command selects the files that do not have that extension.

Using rm -i is a good idea because it prompts for confirmation before each file is deleted. That can become tedious if the list is long, so decide for yourself whether to include it.

Deleting lots of files this way can be dangerous. Once they are deleted you cannot get them back. So make sure you run the find command without the rm first and inspect the list thoroughly before deleting anything.

Update:

As stated in the comments by aculich, you can also do the following -

find /path/to/files ! -name "*.avi" -type f -delete

-type f ensures that only regular files are found and deleted; directories, symlinks, etc. are not touched.
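To cover more of the extension list from the question with this approach, the -name tests can be grouped and negated together. The following is only a minimal sketch, not part of the original command: it assumes GNU find, uses -iname so that .M4V and .m4v are treated alike (an assumption about what is wanted), and runs inside a throwaway directory created with mktemp so that no real files are touched. Only three extensions are shown; the others follow the same pattern.

```shell
# Throwaway sandbox so the example cannot touch real files.
demo=$(mktemp -d) || exit 1
touch "$demo/a.avi" "$demo/b.MKV" "$demo/notes.txt" "$demo/README"

# Preview: regular files whose name matches none of the listed extensions.
# Files with no extension at all (README) are selected too.
to_delete=$(find "$demo" -type f ! \( -iname '*.avi' -o -iname '*.mkv' -o -iname '*.mp4' \))
printf 'would delete:\n%s\n' "$to_delete"

# Once the preview looks right, run the same expression with -delete.
find "$demo" -type f ! \( -iname '*.avi' -o -iname '*.mkv' -o -iname '*.mp4' \) -delete

# What is left afterwards.
remaining=$(find "$demo" -type f)
printf 'remaining:\n%s\n' "$remaining"
rm -rf "$demo"
```

The parentheses have to be escaped (or quoted) so that the shell passes them through to find unchanged.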

jaypal singh answered Sep 21 '22 21:09


You can use a quick and dirty rm command to accomplish what you want, but keep in mind it is error-prone, non-portable, dangerous, and has severe limitations.

As others have suggested, you can use the find command. I would recommend using find rather than rm in almost all cases.

Since you mention that you are on a Linux system, my examples use the GNU implementation that is part of the findutils package. It is the default on most Linux systems, and it is the one I generally recommend learning, since it has a much richer and more advanced set of features than many other implementations.

Though it can be daunting and seemingly over-complicated, it is worth spending time to master the find command, because it gives you a kind of precise expressiveness and safety that you won't find with most other methods without essentially (poorly) re-inventing this command!

Find Example

People often suggest using the find command in inefficient, error-prone and dangerous ways, so below I outline a safe and effective way to accomplish exactly what you asked for in your example.

Before deleting files I recommend previewing the file list first (or at least part of the list if it is very long):

find path/to/files -type f -regextype posix-extended -iregex '.*\.(avi|mkv|wmv|mp4|mp5|flv|M4V|mpeg|mov|m1v|m2v|3gp|avchd)$'

The above command will show you the list of files that you will be deleting. To actually delete the files you can simply add the -delete action like so:

find path/to/files -type f -regextype posix-extended -iregex '.*\.(avi|mkv|wmv|mp4|mp5|flv|M4V|mpeg|mov|m1v|m2v|3gp|avchd)$' -delete

If you would like to see what will remain you can invert the matches in the preview by adding ! to the preview command (without the -delete) like so:

find path/to/files -type f -regextype posix-extended ! -iregex '.*\.(avi|mkv|wmv|mp4|mp5|flv|M4V|mpeg|mov|m1v|m2v|3gp|avchd)$'

The output of this inverse match should be the same as the output you will see when listing the files after performing the delete unless errors occurred due to permissions problems or unwritable filesystems:

find path/to/files -type f
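Note that the -delete command above removes the files that have one of the listed extensions. If the goal is the one stated in the question, that is, deleting everything else and keeping the videos, the ! belongs in the deleting command instead. A minimal sketch under stated assumptions: GNU find, a temporary directory standing in for path/to/files, and a shortened extension list for brevity.

```shell
# Throwaway sandbox standing in for path/to/files.
demo=$(mktemp -d) || exit 1
mkdir "$demo/sub"
touch "$demo/movie.avi" "$demo/sub/clip.MP4" "$demo/sub/notes.txt" "$demo/Makefile"

pattern='.*\.(avi|mkv|mp4)$'

# Preview the files that would be removed (everything that does NOT match).
find "$demo" -type f -regextype posix-extended ! -iregex "$pattern"

# Remove them, then record what survived.
find "$demo" -type f -regextype posix-extended ! -iregex "$pattern" -delete
kept=$(find "$demo" -type f)
printf 'kept:\n%s\n' "$kept"
rm -rf "$demo"
```

Files without any extension (Makefile here) do not match the pattern, so the negated test selects them for deletion as well.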

Explanation

Here I will explain in some depth the options I chose and why:

I added -type f to restrict the matches to files-only; without that it will match non-files such as directories which you probably don't want. Also note that I put it at the beginning rather than the end because order of predicates can matter for speed; with -type f first it will execute the regex check against files-only instead of against everything... in practice it may not matter much unless you have lots of directories or non-files. Still, it's worth keeping order of predicates in mind since it can have a significant impact in some cases.

I use the case-insensitive -iregex option as opposed to the case-sensitive -regex option because I assumed that you wanted to use case-insensitive matching so it will include both .wmv and .WMV files.
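The difference between the two options can be seen with a small experiment. This is only an illustrative sketch, assuming GNU find and a case-sensitive filesystem; the file names are made up for the demonstration.

```shell
# Throwaway sandbox with one lower-case and one upper-case extension.
demo=$(mktemp -d) || exit 1
touch "$demo/one.wmv" "$demo/two.WMV"

# Case-sensitive: only one.wmv matches.
sensitive=$(find "$demo" -type f -regextype posix-extended -regex '.*\.wmv$')

# Case-insensitive: both files match.
insensitive=$(find "$demo" -type f -regextype posix-extended -iregex '.*\.wmv$')

printf 'regex:\n%s\n' "$sensitive"
printf 'iregex:\n%s\n' "$insensitive"
rm -rf "$demo"
```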

You'll probably want to use extended POSIX regular expressions for simplicity and brevity. Unfortunately there is not yet a short-hand for -regextype posix-extended, but I would still recommend using it, because you avoid having to add lots of \ backslashes to escape things in longer, more complex regexes and it has more advanced (modern) features. The GNU implementation defaults to emacs-style regexes, which can be confusing if you're not used to them.
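To make the escaping difference concrete, here is the same two-extension pattern written both ways. It is a sketch that assumes GNU find, where the default syntax needs backslashes for grouping and alternation; the file names are invented for the demonstration.

```shell
# Throwaway sandbox with two video files and one other file.
demo=$(mktemp -d) || exit 1
touch "$demo/a.avi" "$demo/b.mkv" "$demo/c.txt"

# Default (emacs-style) syntax: grouping and alternation need backslashes.
emacs_style=$(find "$demo" -type f -iregex '.*\.\(avi\|mkv\)$' | sort)

# posix-extended: the same pattern without the extra escaping.
extended=$(find "$demo" -type f -regextype posix-extended -iregex '.*\.(avi|mkv)$' | sort)

printf 'default syntax:\n%s\n' "$emacs_style"
printf 'posix-extended:\n%s\n' "$extended"
rm -rf "$demo"
```

Both commands select the same two files; with thirteen extensions the unescaped form is noticeably easier to read.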

The -delete option should make obvious sense. Sometimes people suggest the slower and more complex -exec rm {} \; instead, usually because they are not aware of the safer, faster, and easier -delete option (in rare cases you may encounter old systems with an ancient version of find that does not have it). It is useful to know that -exec exists, but use -delete where you can for deleting files. Also, do not pipe | the output of find to another program unless you use and understand the -print0 option; otherwise you're in for a world of hurt when you encounter file names that contain spaces or newlines.
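For the cases where the output of find really has to go to another program, the following sketch shows the -print0 pattern. It assumes an xargs that supports -0 (GNU and BSD xargs both do); the file names are made up, and -delete would still be the simpler choice for plain deletion.

```shell
# Throwaway sandbox containing a file name with a space in it.
demo=$(mktemp -d) || exit 1
touch "$demo/my holiday.avi" "$demo/plain.avi" "$demo/keep.txt"

# -print0 separates names with NUL bytes and xargs -0 reads them back,
# so "my holiday.avi" stays one argument instead of being split in two.
find "$demo" -type f -name '*.avi' -print0 | xargs -0 rm -f --

# When no pipe is needed, -exec ... {} + also passes names safely, in batches.
find "$demo" -type f -name '*.txt' -exec ls -l {} +

left=$(find "$demo" -type f)
printf 'left:\n%s\n' "$left"
rm -rf "$demo"
```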

The path/to/files argument I included explicitly. If you leave it out it will implicitly use . as the path, however it is safer (especially with a -delete) to state the path explicitly.

Alternate find Implementations

Even though you said you're on a Linux system, I will also mention the differences that you'll encounter with the BSD implementations, which include Mac OS X! For other systems (like older Solaris boxes), good luck! Upgrade to one of the more modern find variants!

The main difference in this example is regarding regular expressions. The BSD variants use POSIX basic regular expressions (BRE) by default. To avoid the burdensome extra escaping that BRE requires, you can take advantage of the more modern features of extended regular expressions (ERE) by specifying the -E option with the BSD variant, which achieves the same behavior as the GNU variant with -regextype posix-extended.

find -E path/to/files -iregex '.*\.(avi|mkv|wmv|mp4|mp5|flv|M4V|mpeg|mov|m1v|m2v|3gp|avchd)$' -type f

Note in this case that the -E option comes before the path/to/files whereas the -regextype posix-extended option for GNU comes after the path.

It is too bad that GNU does not provide a -E option (yet!). Since I think it would be useful to have parity with the BSD variants, I will submit a patch to findutils to add this option, and if it is accepted I will update this answer accordingly.

rm - Not Recommended

Though I strongly recommend against using rm, I will give examples of how to accomplish more or less what your question specifically asked (with some caveats).

Assuming you use a shell with Bourne syntax (which is usually what you find on Linux systems, which default to the Bash shell), you can use this command:

for ext in avi mkv wmv mp4 mp5 flv M4V mpeg mov m1v m2v 3gp avchd; do rm -f path/to/files/*.$ext; done

If you use Bash and have extended globbing turned on with shopt -s extglob then you can use Pattern Matching with Filename Expansion:

rm -f path/to/files/*.+(avi|mkv|wmv|mp4|mp5|flv|M4V|mpeg|mov|m1v|m2v|3gp|avchd)

The +(pattern-list) extended globbing syntax will match one or more occurrences of the given patterns.

However, I strongly recommend against using rm because:

It is error-prone and dangerous because it is easy to accidentally put a space after the *, which means you will delete everything; the command gives you no preview of what it is about to remove; it is fire-and-forget, so good luck with the aftermath.

It is non-portable because even if it happens to work in your particular shell, the same command line may not work in other shells (including other Bourne-shell variants if you are prone to using Bash-isms).

It has severe limitations because it does not reach files that are nested in subdirectories, and even with just lots of files in a single directory you will quickly hit the limits on command-line length when using file globbing.
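If a plain shell loop is still preferred over find, a case statement at least keeps the selection logic in POSIX sh and handles the "everything except these extensions" direction from the question. This is only a sketch under stated assumptions: a temporary directory stands in for path/to/files, the extension list is shortened, and the loop shares the limitations above in that it does not descend into subdirectories and skips hidden files.

```shell
# Throwaway sandbox standing in for path/to/files.
demo=$(mktemp -d) || exit 1
touch "$demo/a.avi" "$demo/b.MKV" "$demo/notes.txt" "$demo/README"

for f in "$demo"/*; do
  [ -f "$f" ] || continue            # skip directories and an unmatched glob
  name=${f##*/}                      # file name without the directory part
  lower=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
  case $lower in
    *.avi|*.mkv|*.mp4) ;;            # listed extension: keep the file
    *) rm -f -- "$f" ;;              # anything else, or no extension: delete
  esac
done

survivors=$(ls "$demo")
printf 'survivors:\n%s\n' "$survivors"
rm -rf "$demo"
```

Because the glob is expanded by the shell inside the for loop rather than passed to rm as one long argument list, this form does not run into the command-line length limit.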

I wish the rm command would just rm itself into oblivion because I can think of few places where I'd rather use rm instead of (even ancient implementations of) find.

aculich answered Sep 22 '22 21:09