
Verifying a CSV file is really a CSV file

I want to make sure, in PHP, that a CSV file uploaded by one of our clients is really a CSV file. I'm handling the upload itself just fine. I'm not worried about malicious users, but I am worried about the ones who will try to upload Excel workbooks instead. Unless I'm mistaken, an Excel workbook and a CSV can be reported with the same MIME type, so checking that isn't good enough.

Is there a single regular expression that can verify a CSV file is really a CSV file? (I don't need parsing... that's what PHP's fgetcsv() is for.) I've seen several, but they are usually followed by comments like "it didn't work for case X."

Is there some other, better way of handling this?

(I expect the CSV to hold first/last names, department names... nothing fancy.)

Guttsy asked Sep 17 '10 21:09



1 Answer

Unlike other file formats, CSV has no tell-tale bytes in the file header. It starts straight away with the actual data.
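
Since the concern is Excel workbooks specifically, the reverse check is possible: Excel files do have tell-tale header bytes. A minimal sketch, assuming the upload is already saved at a readable path; `looksLikeExcelWorkbook()` is a hypothetical helper name, not something from the question. Note that the ZIP signature matches any ZIP archive, not only .xlsx, which is still fine for rejecting non-CSV uploads.

```php
<?php
// Hypothetical helper: true when the file starts with a signature used by
// Excel workbooks. .xlsx files are ZIP containers; legacy .xls files are
// OLE2 compound documents.
function looksLikeExcelWorkbook(string $path): bool
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable; the caller decides how to treat that
    }
    $head = fread($handle, 8);
    fclose($handle);
    if ($head === false) {
        return false;
    }

    $zipSignature  = "PK\x03\x04";                       // .xlsx (any ZIP)
    $ole2Signature = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1"; // .xls (OLE2)

    return strncmp($head, $zipSignature, 4) === 0
        || strncmp($head, $ole2Signature, 8) === 0;
}
```

A `false` result only means the file is not an obvious workbook; it does not prove the file is valid CSV, so the parsing check is still needed.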

I don't see any way other than actually parsing it and checking whether each row has the expected number of columns.
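
A rough sketch of that check using fgetcsv(), assuming a comma delimiter and a known column count (three here would fit first name, last name, department). The function name `hasExpectedColumnCount()` and its parameters are made up for illustration.

```php
<?php
// Hypothetical helper: true when the file has at least one row and every
// non-blank row has exactly $expectedColumns fields.
function hasExpectedColumnCount(string $path, int $expectedColumns): bool
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return false;
    }

    $rows = 0;
    // Delimiter, enclosure and escape are passed explicitly; adjust to taste.
    while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // fgetcsv() returns an array holding a single null for a blank line.
        if ($fields === [null]) {
            continue;
        }
        if (count($fields) !== $expectedColumns) {
            fclose($handle);
            return false;
        }
        $rows++;
    }
    fclose($handle);

    return $rows > 0;
}
```

Usage would be something like `hasExpectedColumnCount($uploadedPath, 3)`. A binary workbook fed through this will almost always fail the column count on the first row.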

It may be enough to read only as many characters as are needed to get the first line (that is, up to the first line break) and check that.
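
The first-line shortcut could look roughly like this. It is a sketch under assumptions: the delimiter is a comma, the first row has no quoted field containing a line break, and 4096 bytes is enough to reach the first line break. `firstLineHasColumns()` and the NUL-byte check are my additions for illustration.

```php
<?php
// Hypothetical helper: reads only the first line and checks its field count.
function firstLineHasColumns(string $path, int $expectedColumns, int $maxBytes = 4096): bool
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return false;
    }
    // fgets() stops at the first line break or after $maxBytes - 1 bytes.
    $line = fgets($handle, $maxBytes);
    fclose($handle);
    if ($line === false) {
        return false; // empty file
    }

    // A NUL byte in the first line strongly suggests a binary file.
    if (strpos($line, "\0") !== false) {
        return false;
    }

    $fields = str_getcsv(rtrim($line, "\r\n"), ',', '"', '\\');

    return count($fields) === $expectedColumns;
}
```

This is cheaper than parsing the whole file but weaker: later rows are never looked at, so a file that is only malformed further down would pass.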

Pekka answered Sep 27 '22 20:09
