
Naming convention uploaded files [closed]

On my site I let users upload files.

If the file is valid and uploaded it is moved to a folder (using PHP).

All users upload to the same folder.

I think I need to rename the uploaded files.

Is there something like a default naming convention to let users upload files with the same filename?

PeeHaa asked May 01 '11 20:05



1 Answer

There are no standard conventions, but there are a couple of best practices:


Organizing your files into (User and/or Date) Aware Folders

Something like:

/uploads/USER/ or
/uploads/[USER/]YEAR/[MONTH/[DAY/[HOUR/[MINUTE/]]]]

This will have some benefits:

  • organize files per user and/or date
  • make it harder to reach the maximum number of files per directory
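A minimal sketch of how such a folder layout could be built, assuming a /uploads/USER/YEAR/MONTH/ scheme. The helper names uploadDirectory() and ensureDirectory() are made up for illustration, and the username filter is just one conservative choice:

```php
<?php
// Hypothetical helper: builds a /uploads/USER/YEAR/MONTH/ style path.
function uploadDirectory(string $base, string $user, ?int $timestamp = null): string
{
    // Assumption: user names are reduced to a conservative alphabet first,
    // so they can never contain "/" or "..".
    $user = preg_replace('~[^0-9a-z_-]+~i', '', $user);

    if ($user === '') {
        throw new InvalidArgumentException('User name has no usable characters.');
    }

    // date('Y/m') gives e.g. "2011/05"; the slash is a literal in the format.
    return rtrim($base, '/') . '/' . $user . '/' . date('Y/m', $timestamp ?? time()) . '/';
}

// Hypothetical helper: creates the directory (and its parents) if missing.
function ensureDirectory(string $directory): bool
{
    // The third argument makes mkdir() create missing parent directories.
    // The trailing is_dir() covers another request creating it in between.
    return is_dir($directory) || mkdir($directory, 0755, true) || is_dir($directory);
}
```

You would call ensureDirectory(uploadDirectory('/uploads', $user)) once per upload, before moving the file into place.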

(Not) Renaming / Sanitizing Filenames

Renaming or not is a choice you will have to make, depending on your website, user base, how obscure you would like to be and, obviously, your architecture. Would you prefer to have a file named kate_at_the_beach.jpg or 1304357611.jpg? This is really up to you to decide, but search engines (obviously) like the first one better.

One thing you should always do is sanitize and normalize the filenames. Personally, I would only allow the following chars: 0-9, a-z, A-Z, _, - and the dot (.). If you choose this sanitization alphabet, normalization basically means just converting the filename to either lower or upper case (to avoid losing files if, for instance, you switch from a case-sensitive file system to a case-insensitive one, like Windows).

Here is some sample code I use in phunction (shameless plug, I know :P):

$filename = '/etc/hosts/@Álix Ãxel likes - beer?!.jpg';
$filename = Slug($filename, '_', '.'); // etc_hosts_alix_axel_likes_beer_.jpg

// replaces every run of chars outside 0-9, a-z and $extra with $slug, then lowercases
function Slug($string, $slug = '-', $extra = '')
{
    return strtolower(trim(preg_replace('~[^0-9a-z' . preg_quote((string) $extra, '~') . ']+~i', $slug, Unaccent($string)), $slug));
}

function Unaccent($string) // normalizes (romanization) accented chars
{
    // encode accented chars as entities (e.g. &Aacute;), then keep only the base letter(s)
    if (strpos($string = htmlentities($string, ENT_QUOTES, 'UTF-8'), '&') !== false)
    {
        $string = html_entity_decode(preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|tilde|uml);~i', '$1', $string), ENT_QUOTES, 'UTF-8');
    }

    return $string;
}
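One caveat with an alphabet that allows the dot: names like .htaccess or .. would pass through unchanged. Here is a rough, self-contained sketch of a stricter wrapper for the name the browser sends. safeUploadName() and its $fallback parameter are hypothetical (not part of phunction), and unlike Unaccent() it simply replaces accented chars with _ instead of romanizing them:

```php
<?php
// Hypothetical wrapper, not part of phunction.
function safeUploadName(string $clientName, string $fallback = 'upload'): string
{
    // Drop any directory part the client sent, handling both separators.
    $name = basename(str_replace('\\', '/', $clientName));

    // Same alphabet as above: anything outside 0-9, a-z, _, - and . becomes "_".
    $name = strtolower(preg_replace('~[^0-9a-z_.-]+~i', '_', $name));

    // Collapse runs of dots and trim leading/trailing separators,
    // so ".." or a leading dot (".htaccess") cannot survive.
    $name = trim(preg_replace('~\.{2,}~', '.', $name), '._-');

    // Nothing usable left: fall back to a neutral name.
    return ($name === '') ? $fallback : $name;
}
```

Whether you want to go this far depends on what else lives in the upload folder and how it is served.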

Handling Duplicate Filenames

As the documentation entry on move_uploaded_file() states:

If the destination file already exists, it will be overwritten.

So, before you call move_uploaded_file(), you should check whether the file already exists. If it does, and you don't want to lose your older file, rename the new one, usually by inserting a time / random / unique token before the file extension, doing something like this:

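// $input is the form field name, $output the destination folder (with trailing slash)
if (file_exists($output . $filename) === true)
{
    $token = '_' . time(); // see below
    $dot = strrpos($filename, '.');

    // insert the token before the extension, or append it if there is no extension
    $filename = substr_replace($filename, $token, ($dot === false) ? strlen($filename) : $dot, 0);
}

move_uploaded_file($_FILES[$input]['tmp_name'], $output . $filename);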

This will have the effect of inserting the $token before the file extension, like I stated above. As for the choice of the $token value you have several options:

  • time() - unique only at one-second resolution (two same-named uploads within the same second still collide), and it sucks at handling duplicate files
  • random - not a very good idea, since it doesn't ensure uniqueness and doesn't handle duplicates
  • unique - using a hash of the file contents is my favorite approach, since it guarantees content uniqueness and saves you HD space, as you'll only have at most 2 identical files (one with the original filename and another one with the hash appended), sample code:


$token = '_' . md5_file($_FILES[$input]['tmp_name']);
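Putting the hash token together with the file_exists() check could look roughly like this. resolveUploadName() is a made-up name, and the extra step of keeping the original name when the stored file has identical content is my own assumption on top of the approach described above:

```php
<?php
// Hypothetical helper: picks a destination name that never overwrites
// a stored file with different content.
function resolveUploadName(string $tmpFile, string $directory, string $filename): string
{
    $directory = rtrim($directory, '/') . '/';

    if (!file_exists($directory . $filename)) {
        return $filename; // no clash, keep the name
    }

    $hash = md5_file($tmpFile);

    if ($hash === md5_file($directory . $filename)) {
        return $filename; // same bytes already stored under this name
    }

    // Different content: insert the hash before the extension,
    // or append it if the name has no extension.
    $dot = strrpos($filename, '.');

    return substr_replace($filename, '_' . $hash, ($dot === false) ? strlen($filename) : $dot, 0);
}
```

You would then pass $output . resolveUploadName($_FILES[$input]['tmp_name'], $output, $filename) as the destination of move_uploaded_file(). Keep in mind that the check and the move are two separate steps, so two simultaneous uploads can still race each other.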

Hope it helps! ;)

Alix Axel answered Oct 09 '22 12:10