
Uploading PDF or .doc and security

I have a script that lets the user upload text files (PDF or doc) to the server, then the plan is to convert them to raw text. But until the file is converted, it's in its raw format, which makes me worried about viruses and all kinds of nasty things.

Any ideas on what I need to do to minimize the risk from these unknown files? How can I check that a file is clean, that it really is the format it claims to be, and that it will not crash the server?

Kamo asked Jan 22 '23 00:01


2 Answers

I originally posted this as a comment to Aerik, but it is really the answer to the question.

If you have PHP >= 5.3 use finfo_file(). If you have an older version of PHP you can use mime_content_type() (less reliable) or load the Fileinfo extension from PECL.

Both of these functions return the MIME type of the file (by inspecting the data inside it). For PDF it should be

application/pdf

For a Word document it could be a few things. Generally it should be

application/msword
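
A minimal sketch of that check, assuming PHP 7 or later with the Fileinfo extension enabled. The function name isAllowedUpload() and the allow-list are illustrative, not part of any library. application/pdf is the registered MIME type for PDF, and legacy .doc files are usually reported as application/msword, though results can vary with the magic database on your server.

```php
<?php
// Hypothetical helper: returns true only if the file's detected MIME type
// is on the allow-list. Detection is based on the file contents, not on
// the file name or the type the browser claimed.
function isAllowedUpload(string $path, array $allowedTypes): bool
{
    $finfo = finfo_open(FILEINFO_MIME_TYPE);
    if ($finfo === false) {
        return false; // could not load the magic database: fail closed
    }
    $type = finfo_file($finfo, $path);

    return $type !== false && in_array($type, $allowedTypes, true);
}

// Assumed allow-list; extend it if you accept other formats.
$allowedTypes = ['application/pdf', 'application/msword'];
```

In an upload handler this might be called as isAllowedUpload($_FILES['document']['tmp_name'], $allowedTypes), where 'document' is a made-up form field name. A matching MIME type only says the file looks like a PDF or Word document; it does not prove the file is harmless.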

If your server is running *nix, make sure the files you're saving aren't executable. Even better: save them to a folder outside the web root, so the web server never serves them directly. Your code can still read the files, but someone requesting a web page won't be able to access them at all.
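
A rough sketch of that storage step, assuming PHP 7 or later. The helper names and the idea of giving each file a random name are my own illustration, not something the answer prescribes.

```php
<?php
// Hypothetical helper: builds a destination path inside a private storage
// directory, using a random name so the uploader controls neither the
// file name nor the extension.
function buildStoragePath(string $storageDir, string $extension): string
{
    $name = bin2hex(random_bytes(16)); // 32 hex characters
    return rtrim($storageDir, '/') . '/' . $name . '.' . $extension;
}

// Hypothetical helper: moves an uploaded file into private storage and
// makes sure no execute bit is set. Returns the new path, or null on failure.
function storeUpload(string $tmpPath, string $storageDir, string $extension): ?string
{
    $target = buildStoragePath($storageDir, $extension);
    // move_uploaded_file() only succeeds for files received via HTTP POST.
    if (!move_uploaded_file($tmpPath, $target)) {
        return null;
    }
    chmod($target, 0640); // owner read/write, group read, nothing executable
    return $target;
}
```

A call could look like storeUpload($_FILES['document']['tmp_name'], '/var/uploads', 'pdf'), where /var/uploads stands in for whatever directory outside the web root you choose and 'document' is a made-up form field name.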

Cfreak answered Jan 30 '23 14:01


If you've ever opened or executed any user-uploaded file on the server, you should expect that your server is now compromised.

Even a JPG can contain executable PHP. If you include or require the file in any way in your script, that can also compromise your server. An image you stumble upon on the web, served like so...


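<?php
// Attacker's script: serves a real JPEG with PHP source appended to the image data.
header('Content-Type: image/jpeg');
header('Content-Disposition: inline; filename="test.jpg"');

echo file_get_contents('/some_image.jpg');
echo '<?php phpinfo(); ?>'; // payload sent as literal text after the image bytes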

... which you save and re-host on your own server like so...


<?php
// Vulnerable: include parses the file as PHP, so any embedded PHP block runs.
$q = $_GET['q']; // pretend this is sanitized for the moment
header('Content-Type: ' . mime_content_type($q));
header('Content-Disposition: inline; filename="' . $_GET['q'] . '"');

include $q;

...will execute phpinfo() on your server. Your site users can then simply save the image to their desktop and open it in Notepad to see your server settings, because the phpinfo() output is appended to the image data. Simply converting the file to another format will discard that script and should not trigger any actual virus attached to the file.
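
One way to avoid the include problem, sketched under assumptions: send the stored file's bytes with readfile(), which outputs them without parsing them as PHP, and refuse any path that resolves outside the storage directory. resolveStoredFile() and the /var/uploads path are hypothetical names used only for illustration.

```php
<?php
// Hypothetical helper: maps a requested name to a real file inside $baseDir,
// or returns null if the file is missing or resolves outside that directory.
function resolveStoredFile(string $baseDir, string $requestedName): ?string
{
    $base = realpath($baseDir);
    // basename() drops any directory components such as "../".
    $path = realpath($baseDir . '/' . basename($requestedName));
    if ($base === false || $path === false || !is_file($path)) {
        return null;
    }
    // Reject anything (for example a symlink) that points outside $baseDir.
    $prefix = $base . DIRECTORY_SEPARATOR;
    return strncmp($path, $prefix, strlen($prefix)) === 0 ? $path : null;
}

// Hypothetical usage in a download script (not executed here):
// $path = resolveStoredFile('/var/uploads', $_GET['q'] ?? '');
// if ($path === null) { http_response_code(404); exit; }
// header('Content-Type: application/octet-stream');
// header('Content-Disposition: attachment; filename="' . basename($path) . '"');
// readfile($path); // streams raw bytes; the content is never parsed as PHP
```

Sending the file as an attachment with a generic content type is a cautious default I chose for the sketch; an inline image viewer would need the real MIME type instead.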

It might also be best to run a virus scan on upload. You should be able to call a command-line scanner via a system command and parse its output to see whether it found anything. Your site users should be checking files they download anyway.
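
A sketch of such a scan, assuming ClamAV's clamscan command is installed and on the PATH (the answer itself does not name a scanner). clamscan documents its exit status as 0 for no virus found, 1 for virus found, and 2 for an error, so this example reads the exit code rather than parsing the text output. Both function names are hypothetical.

```php
<?php
// Hypothetical helper: translates clamscan's documented exit status.
function interpretScanExit(int $exitCode): string
{
    switch ($exitCode) {
        case 0:
            return 'clean';
        case 1:
            return 'infected';
        default:
            return 'error'; // scanner failed or is not installed
    }
}

// Hypothetical helper: scans one file and returns 'clean', 'infected' or 'error'.
function scanUpload(string $path): string
{
    $output = [];
    $exitCode = 2; // assume an error until exec() reports otherwise
    // escapeshellarg() keeps the path from being interpreted by the shell.
    exec('clamscan --no-summary ' . escapeshellarg($path) . ' 2>&1', $output, $exitCode);
    return interpretScanExit($exitCode);
}
```

Treating anything other than 'clean' as a rejection is the safer default, since an 'error' result means the file was never actually checked.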

Otherwise, even a virus-laden user-uploaded file just sitting there on your server shouldn't harm anything... as far as I know.

bob-the-destroyer answered Jan 30 '23 14:01