Parse a String containing multipart/form-data request body in Java

Problem statement

I think the title says it all: I'm looking for a way to parse a String containing the body of a multipart/form-data HTTP request. That is, the contents of the string would look something like this:

--xyzseparator-blah
Content-Disposition: form-data; name="param1"

hello, world
--xyzseparator-blah
Content-Disposition: form-data; name="param2"

42
--xyzseparator-blah
Content-Disposition: form-data; name="param3"

blah, blah, blah
--xyzseparator-blah--

What I'm hoping to obtain is a parameter map, or something similar, like this:

parameters.get("param1");    // returns "hello, world"
parameters.get("param2");    // returns "42"
parameters.get("param3");    // returns "blah, blah, blah"
parameters.keys();           // returns ["param1", "param2", "param3"]

Further criteria

  • It would be best if I didn't have to supply the separator (i.e. xyzseparator-blah in this case), but I can live with it if I do have to.
  • I'm looking for a library-based solution, ideally from a mainstream library (like "Apache Commons" or something similar).
  • I want to avoid rolling my own solution, but at the current stage, I'm afraid I will have to. Reason: while the example above seems trivial to split/parse with some string manipulation, real multipart request bodies can have many more headers. Besides that, I do not want to re-invent (and much less re-test!) the wheel :)

Alternative solution

If there were a solution that satisfies the above criteria, but whose input is an Apache HttpRequest instead of a String, that would be acceptable too. (Basically, I do receive an HttpRequest, but the in-house library I'm using is built so that it extracts the body of this request as a String and passes that to the class responsible for the parsing. However, if need be, I could also work directly on the HttpRequest.)

Related questions

No matter how I search for an answer, through Google, here on SO, and on other forums, the solution always seems to be to use Commons FileUpload to go through the parts. However, the parseRequest method used in that solution expects a RequestContext, which I do not have (only an HttpRequest).

The other way, also mentioned in some of the above answers, is getting the parameters from the HttpServletRequest (but again, I only have HttpRequest).

EDIT: In other words: I could include Commons FileUpload (I have access to it), but that would not help me, because I have an HttpRequest and Commons FileUpload needs a RequestContext. (Unless there is an easy way to convert from HttpRequest to RequestContext that I have overlooked.)

asked Jan 23 '18 20:01 by Attilio


1 Answer

You can parse your String using Commons FileUpload by wrapping it in a class implementing 'org.apache.commons.fileupload.UploadContext', like below.

I recommend wrapping the HttpRequest instead, as in your proposed alternative solution, for two reasons. First, using a String means that the whole multipart POST body, including any file contents, needs to fit into memory. Wrapping the HttpRequest would let you stream it, with only a small buffer in memory at any one time. Second, without the HttpRequest, you'll need to sniff out the multipart boundary, which would normally be given in the Content-Type header (see RFC 1867).
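If the Content-Type header value is available, the boundary can be read from it rather than guessed from the first line of the body. The following is a minimal JDK-only sketch of that idea. The class and method names (BoundaryExtractor, fromContentType) are hypothetical and not part of Commons FileUpload, and the parsing is simplified: it does not handle backslash escapes inside a quoted value.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Commons FileUpload. */
final class BoundaryExtractor {

    private BoundaryExtractor() {}

    /**
     * Returns the boundary parameter of a Content-Type header value such as
     * "multipart/form-data; boundary=xyzseparator-blah", if one is present.
     */
    static Optional<String> fromContentType(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        // Parameters follow the media type and are separated by ';'.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            // Parameter names are case-insensitive.
            if (part.toLowerCase(Locale.ROOT).startsWith("boundary=")) {
                String value = part.substring("boundary=".length()).trim();
                // The value may be a quoted string; strip the surrounding quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? Optional.empty() : Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Prints the Optional holding the boundary from the question's example.
        System.out.println(fromContentType("multipart/form-data; boundary=xyzseparator-blah"));
    }
}
```

Returning an Optional makes the "no boundary in the header" case explicit, so the caller can fall back to sniffing the first line of the body if it has to.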

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.commons.fileupload.FileItem;
import org.apache.commons.fileupload.FileItemFactory;
import org.apache.commons.fileupload.FileUpload;
import org.apache.commons.fileupload.disk.DiskFileItemFactory;

public class MultiPartStringParser implements org.apache.commons.fileupload.UploadContext {

    public static void main(String[] args) throws Exception {
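        // Decode with an explicit charset rather than the platform default.
        String s = new String(Files.readAllBytes(Paths.get(args[0])), java.nio.charset.StandardCharsets.UTF_8);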
        MultiPartStringParser p = new MultiPartStringParser(s);
        for (String key : p.parameters.keySet()) {
            System.out.println(key + "=" + p.parameters.get(key));
        }
    }
    
    private String postBody;
    private String boundary;
    private Map<String, String> parameters = new HashMap<String, String>();
            
    public MultiPartStringParser(String postBody) throws Exception {
        this.postBody = postBody;
        // Sniff out the multipart boundary from the first line ("--" + boundary).
        this.boundary = postBody.substring(2, postBody.indexOf('\n')).trim();
        // Parse out the parameters.
        final FileItemFactory factory = new DiskFileItemFactory();
        FileUpload upload = new FileUpload(factory);
        List<FileItem> fileItems = upload.parseRequest(this);
        for (FileItem fileItem: fileItems) {
            if (fileItem.isFormField()){
                parameters.put(fileItem.getFieldName(), fileItem.getString());
            } // else it is an uploaded file
        }
    }
    
    public Map<String,String> getParameters() {
        return parameters;
    }

    // The methods below here are to implement the UploadContext interface.
    @Override
    public String getCharacterEncoding() {
        return "UTF-8"; // You should know the actual encoding.
    }
    
    // This is the deprecated method from RequestContext that unnecessarily
    // limits the length of the content to ~2GB by returning an int. 
    @Override
    public int getContentLength() {
        return -1; // Don't use this
    }

    @Override
    public String getContentType() {
        // Use the boundary that was sniffed out above.
        return "multipart/form-data; boundary=" + this.boundary;
    }

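    @Override
    public InputStream getInputStream() throws IOException {
        // Encode explicitly so the bytes match getCharacterEncoding().
        return new ByteArrayInputStream(postBody.getBytes(java.nio.charset.StandardCharsets.UTF_8));
    }

    @Override
    public long contentLength() {
        // Length in bytes, not chars; the two differ for non-ASCII text.
        return postBody.getBytes(java.nio.charset.StandardCharsets.UTF_8).length;
    }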
}
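
One practical caveat when the body comes from a file, as in main above: multipart bodies use CRLF line endings, and as far as I know the Commons FileUpload parser looks for CRLF sequences, so a body whose line endings were converted to bare LF (for example by a text editor) will probably fail to parse. Below is a minimal JDK-only sketch that restores CRLF. The class and method names (LineEndings, toCrlf) are made up for illustration, and this is only reasonable for bodies containing text form fields, since it would also rewrite LF bytes inside uploaded binary content.

```java
/** Hypothetical helper for illustration; only suitable for text-only bodies. */
final class LineEndings {

    private LineEndings() {}

    /** Converts bare LF line endings to CRLF, leaving existing CRLF pairs as they are. */
    static String toCrlf(String body) {
        // "\r?\n" matches both LF and CRLF, so existing CRLF pairs are not doubled.
        return body.replaceAll("\r?\n", "\r\n");
    }
}
```

With this in place, the String could be passed through LineEndings.toCrlf(...) before it is handed to the MultiPartStringParser constructor.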
answered Oct 16 '22 21:10 by roninjoe