youtube-dl filename formatting lowercase and dashes

I use youtube-dl frequently and have a simple file-naming scheme: lowercase only, with parts of the same group joined by "-" (dash) and different groups joined by "_" (underscore).

I'm not familiar with regex, so I can't tell whether the youtube-dl config file can be set up to store downloaded clips according to my naming scheme. E.g.:

video:

youtube-dl https://www.youtube.com/watch?v=X8uPIquE5Oo

my youtube-dl config:

--output '~/videos/%(uploader)s_%(upload_date)s_%(title)s.%(ext)s' --restrict-filenames

my output:

Queen_Forever_20111202_Bohemian_Rhapsody_Live_at_Wembley_11-07-1986.mp4

desired output:

queen-forever_20111202_bohemian-rhapsody-live-at-wembley-11-07-1986.mp4

NB: The manual mentions Python options, but I can't work out how to apply them to my case.

alex asked Nov 16 '22 14:11


1 Answer

I don't think it's possible via the string formatting options alone, but you can call youtube-dl from a Python script, so a small Python wrapper can do it.

Youtube DL Documentation

import os

import youtube_dl

# Just a random string that won't appear naturally in the filename.
# Required since the dynamic fields (e.g. title) already contain the
# character (`_`) that will be used as the separator when we're done.
# Defined at module level so the hook below can always see it.
separator = '@#@'
download_path = '~/videos/'


def correct_file_naming(response):
    # Called repeatedly while downloading; only act once the file is complete.
    # Note: 'finished' fires when the download ends, before any post-processing
    # (e.g. merging separately downloaded video and audio streams).
    if response['status'] == 'finished':
        directory, old_filename = os.path.split(response['filename'])
        # Replace `_` with `-` first
        # Then replace `separator` with `_`
        new_filename = (old_filename
                        .replace("_", "-")
                        .replace(separator, "_")
                        .lower())

        os.rename(response['filename'], os.path.join(directory, new_filename))


if __name__ == "__main__":
    # see link for options:
    # https://github.com/ytdl-org/youtube-dl/blob/3e4cedf9e8cd3157df2457df7274d0c842421945/youtube_dl/YoutubeDL.py#L137-L312
    ydl_opts = {
        'outtmpl': f'{download_path}%(uploader)s{separator}%(upload_date)s{separator}%(title)s.%(ext)s',
        'restrictfilenames': True,
        'progress_hooks': [correct_file_naming]
    }

    url = 'https://www.youtube.com/watch?v=X8uPIquE5Oo'

    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download([url])
  1. correct_file_naming is called several times during the download, so the if statement makes sure the download has finished before we try to rename the file. Renaming a file that is still being written would cause problems.
  2. The separator variable is a small hack. It acts as a temporary placeholder because there's a conflict over the _ character: you want _ to separate the fields, but with restricted filenames the dynamic fields (e.g. title/uploader) already contain that character. Keeping the two distinct until the rename step ensures each one ends up as the right character.
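
To see why the order of the two replacements matters, here is a minimal sketch of just the renaming step as a standalone function. The name normalize_name is hypothetical, and the raw filename is an assumption about what the template above would produce for the video in the question; nothing is downloaded.

```python
def normalize_name(filename, separator='@#@'):
    """Lowercase a filename, turning `_` into `-` and the marker into `_`."""
    return (filename
            .replace('_', '-')        # word gaps inside a field become dashes
            .replace(separator, '_')  # field boundaries become underscores
            .lower())


# Assumed raw name: restricted-filename fields joined by the '@#@' marker.
raw = 'Queen_Forever@#@20111202@#@Bohemian_Rhapsody_Live_at_Wembley_11-07-1986.mp4'
print(normalize_name(raw))
# queen-forever_20111202_bohemian-rhapsody-live-at-wembley-11-07-1986.mp4
```

If the marker were replaced with `_` first, the following `_` to `-` replacement would turn the field boundaries into dashes as well, and the two levels of grouping would be lost.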

If you wanted, you could expose all the parameters through the Python script so that it becomes nothing more than a small wrapper around the youtube-dl code. It should then integrate easily into any existing scripts you have: just point them at the wrapper.
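
A rough sketch of what such a wrapper might look like, using argparse from the standard library. The option names (--download-path, --separator) and the helper names (parse_args, make_rename_hook, build_options, main) are my own assumptions for illustration, not anything youtube-dl defines; only outtmpl, restrictfilenames and progress_hooks are actual youtube-dl options, as used above.

```python
import argparse
import os


def parse_args(argv=None):
    # Hypothetical command-line interface for the wrapper.
    parser = argparse.ArgumentParser(description='youtube-dl naming wrapper (sketch)')
    parser.add_argument('urls', nargs='+', help='video URLs to download')
    parser.add_argument('--download-path', default='~/videos/')
    parser.add_argument('--separator', default='@#@',
                        help='temporary marker placed between fields')
    return parser.parse_args(argv)


def make_rename_hook(separator):
    # Build a progress hook that remembers which marker to replace.
    def hook(response):
        if response['status'] != 'finished':
            return
        directory, name = os.path.split(response['filename'])
        new_name = name.replace('_', '-').replace(separator, '_').lower()
        os.rename(response['filename'], os.path.join(directory, new_name))
    return hook


def build_options(args):
    sep = args.separator
    template = f'%(uploader)s{sep}%(upload_date)s{sep}%(title)s.%(ext)s'
    return {
        'outtmpl': os.path.join(args.download_path, template),
        'restrictfilenames': True,
        'progress_hooks': [make_rename_hook(sep)],
    }


def main(argv=None):
    import youtube_dl  # imported lazily so the helpers above work without it

    args = parse_args(argv)
    with youtube_dl.YoutubeDL(build_options(args)) as ydl:
        ydl.download(args.urls)


# Inspect the options for one URL without downloading anything.
opts = build_options(parse_args(['https://www.youtube.com/watch?v=X8uPIquE5Oo']))
print(opts['outtmpl'])
```

To actually download, call main() at the bottom of the script; existing scripts can then invoke the wrapper with the same URLs they previously passed to youtube-dl. Passing the separator into make_rename_hook instead of reading a global keeps the hook usable when the module is imported elsewhere.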

Mock Coder answered Dec 28 '22 07:12