
Remove all files older than X days, but keep at least the Y youngest [duplicate]

I have a script that removes DB dumps that are older than say X=21 days from a backup dir:

DB_DUMP_DIR=/var/backups/dbs
RETENTION=$((21*24*60))  # 3 weeks

find ${DB_DUMP_DIR} -type f -mmin +${RETENTION} -delete

But if for whatever reason the DB dump job fails to complete for a while, all dumps will eventually be thrown away. So as a safeguard I want to keep at least the youngest Y=7 dumps, even if all or some of them are older than 21 days.

I am looking for something more elegant than this spaghetti:

DB_DUMP_DIR=/var/backups/dbs
RETENTION=$((21*24*60))  # 3 weeks
KEEP=7

find "${DB_DUMP_DIR}" -type f -printf '%T@ %p\n' |   # list all dumps with epoch mtime
sort -n |                                            # sort by epoch, oldest first
head --lines=-"${KEEP}" |                            # drop the youngest (bottom) 7 dumps
while read -r date filename ; do                     # loop through the rest
    find "$filename" -mmin +"${RETENTION}" -delete   # delete if older than 21 days
done

(This snippet might have minor bugs; ignore them. It is only meant to illustrate what I can come up with myself, and why I don't like it.)

Edit: The find option "-mtime" is off by one: "-mtime +21" actually means "at least 22 days old". That always confused me, so I use -mmin instead. It is still off by one, but only by a minute.
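The rounding described in the edit can be checked with a small experiment. This is only a sketch, assuming GNU find and GNU touch; the temporary directory and the file names are made up for illustration.

```shell
# Hypothetical demo: two files with known ages in a scratch directory.
demo_dir=$(mktemp -d)
touch -d '516 hours ago' "$demo_dir/age_21.5_days"   # 21.5 days old
touch -d '540 hours ago' "$demo_dir/age_22.5_days"   # 22.5 days old

# -mtime counts whole 24-hour periods and ignores the fraction, so the
# 21.5-day-old file counts as 21 and is not matched by "+21" (more than 21).
mtime_matches=$(find "$demo_dir" -type f -mtime +21 -printf '%f\n' | sort)

# -mmin compares in minutes, so any rounding is about a minute at most;
# both files are well past 21*24*60 minutes and should match.
mmin_matches=$(find "$demo_dir" -type f -mmin +$((21*24*60)) -printf '%f\n' | sort)

printf -- '-mtime +21 matches:\n%s\n' "$mtime_matches"
printf -- '-mmin +30240 matches:\n%s\n' "$mmin_matches"
rm -rf "$demo_dir"
```

With -mtime +21 only the 22.5-day-old file is listed, while -mmin lists both.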

Nils Toedtmann asked Dec 03 '13 18:12



2 Answers

Use find to list all files that are old enough to delete, drop the $KEEP youngest of them with tail, strip the timestamps with cut, then pass the rest to xargs.

find "${DB_DUMP_DIR}" -type f -mmin +"${RETENTION}" -printf '%T@ %p\n' |
  sort -nr | tail -n +"$((KEEP + 1))" | cut -d ' ' -f 2- |
  xargs -r -d '\n' echo

Replace echo with rm if the reported list of files is the list you want to remove.

(I assume none of the dump files have newlines in their names.)
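If file names with newlines or other odd characters are a concern, the same pipeline can be run with NUL-terminated records. The following is a minimal sketch, assuming GNU find and a reasonably recent GNU coreutils (sort, tail and cut all need their -z option); the function name prune_candidates is made up for illustration. Like the pipeline above, it spares the KEEP youngest among the files that are already past the retention period, which may retain more than the question strictly requires.

```shell
# prune_candidates DIR RETENTION_MINUTES KEEP   (hypothetical helper)
# Prints, NUL-terminated, the files in DIR older than RETENTION_MINUTES,
# except for the KEEP youngest of those. It deletes nothing by itself.
prune_candidates() {
    find "$1" -type f -mmin +"$2" -printf '%T@ %p\0' |   # "epoch path" records
        sort -z -nr |                                    # newest first
        tail -z -n +"$(($3 + 1))" |                      # skip the KEEP newest
        cut -z -d ' ' -f 2-                              # drop the epoch column
}
```

A dry run would be `prune_candidates /var/backups/dbs $((21*24*60)) 7 | xargs -0 -r printf '%s\n'`; once the list looks right, replace the printf part with `rm -f --`.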

chepner answered Oct 21 '22 14:10



I'm opening a second answer because I have a different solution, one using awk: add the retention period (21 days, in seconds) to each file's modification time and subtract the current time. A negative result means the file is older than the retention period. After sorting and dropping the newest 7 from the list, remove the files with negative values:

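DB_DUMP_DIR=/var/backups/dbs
RETENTION=$((21*24*60*60))  # 3 weeks, in seconds
KEEP=7                      # number of youngest dumps to keep regardless of age
CURR_TIME=$(date +%s)

find "${DB_DUMP_DIR}" -type f -printf '%T@ %p\n' |
  awk -v now="${CURR_TIME}" -v ret="${RETENTION}" \
    '{ t = int($1) - now + ret; sub(/^[^ ]+ /, ""); print t ":" $0 }' |
  sort -n | head -n -"${KEEP}" | grep '^-' | cut -d ':' -f 2- |
  xargs -r -d '\n' rm -f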
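To see why the sign of the computed value is enough to decide, here is the arithmetic for two dumps. It is only a worked example; the "current time" and both modification times are hypothetical values, not taken from a real system.

```shell
# Worked example of the age test; all timestamps are hypothetical.
RETENTION=$((21*24*60*60))               # 21 days in seconds = 1814400
now=1700000000                           # pretend current time (epoch seconds)

mtime_old=$((now - 22*24*60*60))         # a dump written 22 days ago
mtime_new=$((now - 20*24*60*60))         # a dump written 20 days ago

# score = mtime - now + RETENTION, i.e. RETENTION minus the file's age
score_old=$((mtime_old - now + RETENTION))
score_new=$((mtime_new - now + RETENTION))

echo "22-day-old dump: $score_old"       # negative: past retention
echo "20-day-old dump: $score_new"       # positive: still within retention
```

The 22-day-old dump scores -86400 and the 20-day-old dump scores 86400, so filtering on a leading minus sign selects exactly the dumps past the retention period.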
rabensky answered Oct 21 '22 14:10
