
JavaScript regex for extracting the filename from a Content-Disposition header

The Content-Disposition header contains a filename that is usually easy to extract, but the value is sometimes wrapped in double quotes, sometimes unquoted, and there are probably other variants too. Can someone write a regex that works in all of these cases?

Content-Disposition: attachment; filename=content.txt

Here are some of the possible target strings:

attachment; filename=content.txt
attachment; filename*=UTF-8''filename.txt
attachment; filename="EURO rates"; filename*=utf-8''%e2%82%ac%20rates
attachment; filename="omáèka.jpg"
Other combinations may also occur.
adnan kamili asked Apr 14 '14 07:04


3 Answers

You could try something in this spirit:

filename[^;=\n]*=((['"]).*?\2|[^;\n]*)

filename      # match filename, followed by
[^;=\n]*      # anything but a ;, a = or a newline
=
(             # first capturing group
    (['"])    # either single or double quote, put it in capturing group 2
    .*?       # anything up until the first...
    \2        # matching quote (single if we found single, double if we found double)
|             # OR
    [^;\n]*   # anything but a ; or a newline
)

The filename is in the first capturing group (including the surrounding quotes, if there are any): http://regex101.com/r/hJ7tS6
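As a rough sketch of how this regex could be used from JavaScript: the helper name `extractFilename` is made up for illustration, and the quote-stripping step is an addition, since group 1 keeps the quotes when the value is quoted.

```javascript
// Regex from the answer above. Group 1 holds the value, including any
// surrounding quotes; group 2 holds the opening quote character.
const FILENAME_RE = /filename[^;=\n]*=((['"]).*?\2|[^;\n]*)/;

// Hypothetical helper, not part of any library.
function extractFilename(header) {
  const match = FILENAME_RE.exec(header);
  if (!match) return null; // no filename parameter found
  // Strip one pair of matching surrounding quotes, if present.
  return match[1].replace(/^(['"])(.*)\1$/, '$2');
}

console.log(extractFilename('attachment; filename=content.txt'));    // content.txt
console.log(extractFilename('attachment; filename="omáèka.jpg"'));   // omáèka.jpg
console.log(extractFilename("attachment; filename*=UTF-8''filename.txt")); // UTF-8''filename.txt
```

Note that for the `filename*=` form this sketch returns the value as-is, so the `UTF-8''` prefix and any percent-encoding are still present and would need separate handling.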

Robin answered Nov 01 '22 21:11


Slightly modified to match my use case (it strips the surrounding quotes and a leading UTF charset tag such as UTF-8''):

filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?

https://regex101.com/r/UhCzyI/3
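A minimal sketch of this variant in JavaScript. The `i` flag is an assumption added here, not part of the pattern as posted: without it, a lowercase `utf-8''` prefix is not recognised and group 1 captures `utf-8` instead of the filename. The helper name `bareFilename` is hypothetical.

```javascript
// Regex from the answer above, with the `i` flag added (an assumption of
// this sketch) so that a lowercase "utf-8''" prefix is skipped as well.
const FILENAME_RE = /filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?/i;

// Hypothetical helper: returns the bare value from group 1, or null.
function bareFilename(header) {
  const match = FILENAME_RE.exec(header);
  return match ? match[1] : null;
}

console.log(bareFilename("attachment; filename*=UTF-8''filename.txt")); // filename.txt
console.log(bareFilename('attachment; filename="EURO rates"'));          // EURO rates
```

Because group 1 excludes both quote characters, a name that itself contains an apostrophe (for example `filename="it's.txt"`) is cut off at the apostrophe, and percent-encoded values are returned undecoded.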

h0wXD answered Nov 01 '22 21:11


/filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i

https://regex101.com/r/hJ7tS6/51

Edit: You can also use this parser: https://github.com/Rob--W/open-in-browser/blob/master/extension/content-disposition.js
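For the regex in this answer, the value ends up in group 2 when it is quoted and in group 3 when it is not, so a caller has to check both. The sketch below shows one way to do that. The helper name `filenameFromHeader` and the decoding step are assumptions for illustration, and `decodeURIComponent` only handles values that are percent-encoded UTF-8.

```javascript
// Regex from the answer above. Group 2 is a quoted value; group 3 is an
// unquoted value with any leading charset'language' prefix skipped.
const FILENAME_RE = /filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i;

// Hypothetical helper; the decoding step is not part of the answer.
function filenameFromHeader(header) {
  const match = FILENAME_RE.exec(header);
  if (!match) return null;
  // Exactly one of the two groups participates in a successful match.
  const value = match[2] !== undefined ? match[2] : match[3];
  // Only the extended form (filename*=) is treated as percent-encoded.
  if (!/^filename\*/i.test(match[0])) return value;
  try {
    return decodeURIComponent(value); // assumes UTF-8 encoding
  } catch (e) {
    return value; // malformed escape sequence: fall back to the raw text
  }
}

console.log(filenameFromHeader("attachment; filename*=utf-8''%e2%82%ac%20rates")); // € rates
console.log(filenameFromHeader('attachment; filename="EURO rates"'));              // EURO rates
```

When a header carries both `filename=` and `filename*=`, this sketch returns whichever appears first in the string, since the regex is applied only once.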

def00111 answered Nov 01 '22 22:11