
Who is tampering with my data stream?

Tags:

java

sockets

ftp

The piece of code below downloads a file from some URL and saves it to a local file. Piece of cake. What could possibly be wrong here?

protected long download(ProgressMonitor monitor) throws Exception {
    // `is`, `os` and `chunkSize` are defined elsewhere (not shown here).
    long size = 0;
    DataInputStream dis = new DataInputStream(is);
    int read = 0;
    byte[] chunk = new byte[chunkSize];
    // read() returns -1 once the stream reports end of data.
    while ((read = dis.read(chunk)) != -1) {
        os.write(chunk, 0, read);
        size += read;
        if (monitor != null)
            monitor.worked(read);
    }

    chunk = null;
    dis.close();
    os.flush();
    os.close();
    return size;
}

The reason I am posting a question here is that it works 99.999% of the time, but doesn't work as expected whenever an antivirus or some other protection software is installed on the computer running this code. I am blindly pointing a finger that way because whenever I stop (or disable) it, the code works perfectly again. The end result of such interference is that the MD5 of the downloaded file doesn't match the expected one, and a whole new saga begins.

So, the question is: is it really possible that some smart "protection" software would alter the actual stream coming from the URL without me knowing about it? And if so, how do you deal with this? (Verified with Kaspersky and Norton products.)


EDIT-1: Apparently I've got a handle on the problem, and it has nothing to do with antivirus software. The download comes from an FTP server (FileZilla in particular), and we use the Apache Commons FTP client on the client side. I went to the FTP server and terminated the connection (kicked it out) in the middle of a download. I expected is.read(..) to throw an IOException on the client side, but this never happened. Instead, is.read(..) returns -1, meaning that there is no more data coming from the stream. This is definitely unexpected and explains why I sometimes get partial files. It does not explain, however, why the data sometimes gets altered as well.
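Since a dropped connection can look like a normal end of stream, one way to catch partial files is to check the byte count (and digest) after the copy loop instead of relying on an exception. Below is a minimal sketch of that idea using only the JDK. The class name `VerifiedDownload`, the method `copyAndVerify` and the `expectedSize` parameter are hypothetical; it assumes the expected size is known in advance (for example from a manifest or a server-side size query). With Apache Commons Net, checking the return value of `FTPClient.completePendingCommand()` after reading the stream is another signal worth looking at.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class VerifiedDownload {

    /**
     * Copies everything from in to out, then fails if the number of bytes
     * copied differs from expectedSize. Returns the MD5 of the copied bytes
     * as lowercase hex so the caller can compare it with the expected digest.
     */
    static String copyAndVerify(InputStream in, OutputStream out, long expectedSize)
            throws IOException {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
        long size = 0;
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            out.write(chunk, 0, read);
            md5.update(chunk, 0, read);
            size += read;
        }
        out.flush();
        // A -1 from read() only means "no more data"; it does not prove
        // that the whole file arrived, so compare against the known size.
        if (size != expectedSize) {
            throw new IOException("Truncated download: got " + size
                    + " of " + expectedSize + " bytes");
        }
        return toHex(md5.digest());
    }

    /** Formats a byte array as lowercase hexadecimal. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        // Complete transfer: prints the MD5 of the five bytes.
        System.out.println(copyAndVerify(
                new ByteArrayInputStream(data), new ByteArrayOutputStream(), data.length));
        // Simulated dropped connection: only 3 of 5 bytes arrive.
        try {
            copyAndVerify(new ByteArrayInputStream(data, 0, 3),
                    new ByteArrayOutputStream(), data.length);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The size check catches truncation; comparing the returned digest with the expected MD5 catches altered content as well.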

Dima asked Nov 13 '22 00:11


1 Answer

Yeah, this happens to me all the time. In my case it's caused by transparent HTTP proxying by Websense on my corporate network. The worst problems are caused by the block page being returned with 200 OK.

Do you get the same or similar corruption every time? E.g., do you get some HTML explaining why the request was blocked? The best you can probably do is compare the first few bytes of the downloaded data to some text in the block page, and throw an exception in this case.
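A rough sketch of that check, using only the JDK, might look like the following. The class name `BlockPageCheck`, the 256-byte window and the `"<html"` marker are all assumptions for illustration; the real marker should be text taken from the block page actually observed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class BlockPageCheck {

    /**
     * Returns true if the first bytes of the download contain the given
     * marker text (case-insensitive). Only the first 256 bytes are inspected.
     */
    static boolean looksLikeBlockPage(byte[] data, int length, String marker) {
        int n = Math.min(length, 256);
        // ISO-8859-1 maps every byte to one char, so binary data cannot
        // cause a decoding failure here.
        String head = new String(data, 0, n, StandardCharsets.ISO_8859_1)
                .toLowerCase(Locale.ROOT);
        return head.contains(marker.toLowerCase(Locale.ROOT));
    }

    /** Throws if the first chunk looks like a block page instead of the file. */
    static void rejectBlockPage(byte[] firstChunk, int length, String marker)
            throws IOException {
        if (looksLikeBlockPage(firstChunk, length, marker)) {
            throw new IOException("Download appears to be a proxy block page");
        }
    }

    public static void main(String[] args) {
        byte[] page = "<HTML><body>Access blocked by policy</body></HTML>"
                .getBytes(StandardCharsets.ISO_8859_1);
        try {
            rejectBlockPage(page, page.length, "<html");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This is a heuristic only: it would misfire if the expected file is itself HTML, so it fits best when the download is known to be binary.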

Edit: based on your update, have you got the FTP client set to image/binary mode?
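To see why the transfer mode matters, here is a small JDK-only sketch that simulates one rewrite a text-mode (ASCII) transfer can apply, turning a bare LF into CRLF. The class `AsciiModeDemo` and the method `lfToCrlf` are hypothetical and only imitate the effect; the exact translation depends on the client, server and platform. With Apache Commons Net, binary mode is normally selected by calling `FTPClient.setFileType(FTP.BINARY_FILE_TYPE)` after logging in.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

class AsciiModeDemo {

    /** Simulates a text-mode rewrite: every LF byte (0x0A) becomes CR LF. */
    static byte[] lfToCrlf(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length + 16);
        for (byte b : in) {
            if (b == '\n') {
                out.write('\r');
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Binary payload that happens to contain the byte 0x0A.
        byte[] original = {0x01, 0x0A, 0x02};
        byte[] received = lfToCrlf(original);
        System.out.println("original: " + Arrays.toString(original));
        System.out.println("received: " + Arrays.toString(received));
        System.out.println("identical: " + Arrays.equals(original, received));
    }
}
```

Any file containing the affected byte values changes length and content under such a translation, so its MD5 no longer matches, while files that happen to avoid those bytes arrive intact.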

artbristol answered Nov 15 '22 17:11