
servlet file upload filename encoding

I am using Apache Commons FileUpload for a standard file upload. My problem is that I cannot get the proper filename of an uploaded file if it contains special characters (á, é, ú, etc.): they are all converted to ? signs.

request.getCharacterEncoding() returns UTF-8, but in the string returned by fileItem.getName() every special character yields the same bytes.

Can you help me figure out what's wrong?

(Some details: using Firefox 3.6.12, Weblogic 10.3 on Windows)

This is my code snippet:

 public CommandMsg(HttpServletRequest request) {
    Enumeration names = null;
    if (isMultipart(request)) {
      FileItemFactory factory = new DiskFileItemFactory();
      ServletFileUpload upload = new ServletFileUpload(factory);
      try {
        List uploadedItems = upload.parseRequest(request);
        Iterator i = uploadedItems.iterator();
        FileItem fileItem = null;
        while (i.hasNext()) {
          fileItem = (FileItem) i.next();
          if (fileItem.isFormField()) {
            // System.out.println("isFormField");
            setAttribute(fileItem.getFieldName(), fileItem.getString());
          } else {
            String enc = "utf-8";
            enc = request.getCharacterEncoding();
            String fileName = fileItem.getName();
            byte[] fnb = fileItem.getName().getBytes();
            byte[] fnb2 = null;
            try {
                fnb2 = fileItem.getName().getBytes(enc);
                String t1 = new String(fnb);
                String t2 = new String(fnb2);
                String t3 = new String(fnb, enc);
                String t4 = new String(fnb2, enc);
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }
            setAttribute(fileItem.getFieldName(), fileItem);
          }
        }
      } catch (FileUploadException ex) {
        ex.printStackTrace();
      }

// etc..
jabal asked Feb 16 '11 19:02


3 Answers

I had the same problem and solved it like this.

ServletFileUpload upload = new ServletFileUpload(factory);
// Decode the part headers (which carry the uploaded file name) as UTF-8.
upload.setHeaderEncoding("UTF-8");

FileItemIterator iter = upload.getItemIterator(request);
while (iter.hasNext()) {
    FileItemStream item = iter.next();
    String name = item.getFieldName();
    InputStream stream = item.openStream();
    if (item.isFormField()) {
        // Decode the form field value as UTF-8 as well.
        String value = Streams.asString(stream, "UTF-8");
    }
}

If you based your code on the streaming example at http://commons.apache.org/fileupload/streaming.html, make sure you set UTF-8 in both places shown above: in setHeaderEncoding() and in Streams.asString().

Christoph answered Oct 19 '22 23:10


You also need to ensure that the target where you print, write, or insert the file name (console, file, database, etc.) supports UTF-8. The question marks indicate that the target is not configured to accept UTF-8 and is itself aware of that. Otherwise you would have seen mojibake instead.
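To illustrate the difference between question marks and mojibake, here is a minimal stand-alone sketch. It does not use Commons FileUpload at all; the class and method names are made up for illustration, and US-ASCII merely stands in for "a target that cannot represent the characters".

```java
import java.nio.charset.StandardCharsets;

class QuestionMarkVsMojibake {
    // Encoding into a charset that lacks the character: the encoder
    // substitutes its replacement byte, which is '?' for US-ASCII.
    static String lossyEncode(String s) {
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    // Decoding UTF-8 bytes with the wrong charset: every byte maps to
    // some character, so the result is mojibake rather than '?'.
    static String wrongDecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // a-acute, e-acute, u-acute, written as escapes to keep the source ASCII-only
        String name = "\u00e1\u00e9\u00fa.txt";
        System.out.println(lossyEncode(name));
        System.out.println(wrongDecode(name));
    }
}
```

The first case is lossy: once a character has become ?, no later re-encoding can bring it back. The second case is usually reversible, because the original bytes are still present and were only interpreted with the wrong charset.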

Since the question doesn't say what the target is, I can't do much more than suggest working through this article to understand what's going on with characters behind the scenes.

BalusC answered Oct 19 '22 22:10


I solved the problem by explicitly calling .setHeaderEncoding("ISO-8859-2") on the ServletFileUpload instance.
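A rough stand-alone sketch of why the header encoding matters: the filename arrives as raw bytes in the part headers, and the parser has to pick a charset to decode them. The helper below is hypothetical and only mimics that decoding step with plain JDK classes. It also assumes the JVM provides ISO-8859-2, which is common but not one of the charsets every JVM is required to ship.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class HeaderCharsetDemo {
    // Hypothetical helper: decode raw header bytes with a given charset,
    // roughly what a multipart parser does with the filename parameter.
    static String decode(byte[] rawHeaderBytes, Charset charset) {
        return new String(rawHeaderBytes, charset);
    }

    public static void main(String[] args) {
        // Assumption: ISO-8859-2 is available in this JVM.
        Charset latin2 = Charset.forName("ISO-8859-2");

        // a-acute, e-acute, u-acute, as the client might send them in ISO-8859-2
        String original = "\u00e1\u00e9\u00fa.txt";
        byte[] sent = original.getBytes(latin2);

        // Matching charset: the name is recovered intact.
        System.out.println(decode(sent, latin2));
        // Mismatched charset: the bytes are malformed UTF-8, so the decoder
        // substitutes replacement characters and the name is damaged.
        System.out.println(decode(sent, StandardCharsets.UTF_8));
    }
}
```

Which charset is the right one depends on what the client actually sent, so the value passed to setHeaderEncoding() has to match the bytes in the request rather than what the application would prefer.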

jabal answered Oct 20 '22 00:10