Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java - HttpUrlConnection returns cached response every time

I'm trying to gather statistical data from Roblox's currency exchange for analysis. Therefore, I need up-to-date data instead of a cached result. However, it seems that no matter what I do, the result is still cached. It seems that the most intuitive option, setUseCaches(), had no effect, and setting the header manually as Cache-Control: no-cache does not seem to work either. I inspected the Cache header using Fiddler2 and saw that its value was Cache-Control: max-age=0, but it didn't seem to change the program's behavior either. Here are the relevant pieces of code:

URL:

private final static String URL = "http://www.roblox.com/my/money.aspx#/#TradeCurrency_tab";

GET Request:

    URLConnection socket = new URL( URL ).openConnection( );
    socket.setUseCaches( false );
    socket.setDefaultUseCaches( false );
    HttpURLConnection conn = ( HttpURLConnection )socket;
    conn.setUseCaches( false );
    conn.setDefaultUseCaches( false );
    conn.setRequestProperty( "Pragma",  "no-cache" );
    conn.setRequestProperty( "Expires",  "0" );
    conn.setRequestProperty( "Cookie", ".ROBLOSECURITY=" + ROBLOSECURITY );
    conn.setRequestProperty( "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" );
    conn.setRequestProperty( "Accept-Language", "en-US,en;q=0.8" );
    conn.setRequestProperty( "User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36" );
    conn.setDoInput( true );
    conn.setRequestMethod( "GET" );
    conn.connect();

    Scanner data = new Scanner( conn.getInputStream() );
    data.useDelimiter( "\\A" );
    String result = data.next();

    data.close( );
    conn.disconnect();

It may or may not be important to note that it returns a unique result every time I restart the program but not during program runtime.

Update:

Wireshark analysis (I tweaked my code a bit since last time ):

GET /my/money.aspx HTTP/1.1
Pragma: no-cache
Expires: 0
Cookie: .ROBLOSECURITY=_|WARNING:-DO-NOT-SHARE-THIS.--Sharing-this-will-allow-someone-to-log-in-as-you-and-to-steal-your-ROBUX-and-items.|*sensitive*
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36
Cache-Control: no-cache
Host: www.roblox.com
Connection: keep-alive

HTTP/1.1 200 OK
Cache-Control: private, s-maxage=0
Content-Type: text/html; charset=utf-8
Set-Cookie: rbx-ip=; domain=roblox.com; path=/; HttpOnly
Set-Cookie: RBXSource=rbx_acquisition_time=1/4/2016 12:45:21 AM&rbx_acquisition_referrer=&rbx_medium=Direct&rbx_source=&rbx_campaign=&rbx_adgroup=&rbx_keyword=&rbx_matchtype=&rbx_send_info=0; domain=roblox.com; expires=Wed, 03-Feb-2016 06:45:21 GMT; path=/
Access-Control-Allow-Credentials: true
Set-Cookie: rbx-ip=; domain=roblox.com; path=/; HttpOnly
Set-Cookie: RBXSource=rbx_acquisition_time=1/4/2016 12:45:21 AM&rbx_acquisition_referrer=&rbx_medium=Direct&rbx_source=&rbx_campaign=&rbx_adgroup=&rbx_keyword=&rbx_matchtype=&rbx_send_info=1; domain=roblox.com; expires=Wed, 03-Feb-2016 06:45:21 GMT; path=/
Set-Cookie: RBXEventTrackerV2=CreateDate=1/4/2016 12:45:21 AM&rbxid=59210735&browserid=3940274345; domain=roblox.com; expires=Fri, 22-May-2043 05:45:21 GMT; path=/
Set-Cookie: GuestData=UserID=-856460986; domain=.roblox.com; expires=Fri, 22-May-2043 05:45:21 GMT; path=/
P3P: CP="CAO DSP COR CURa ADMa DEVa OUR IND PHY ONL UNI COM NAV INT DEM PRE"
Date: Mon, 04 Jan 2016 06:45:20 GMT
Content-Length: 153751
like image 487
Christopher O'Toole Avatar asked Dec 30 '15 18:12

Christopher O'Toole


2 Answers

If the caching occurs server-side, append a cachebuster to the URL.

HttpURLConnection conn = ( HttpURLConnection )new URL( URL + "?_=" + System.currentTimeMillis() ).openConnection( );
like image 110
Nathan Dean Avatar answered Nov 19 '22 08:11

Nathan Dean


I notice you are not telling the local HttpURLConnection to bypass its own caches.

HttpURLConnection inherits the method setUseCaches(boolean) from URLConnection. From the Javadoc for setUseCaches(boolean)

Sets the value of the useCaches field of this URLConnection to the specified value.

Some protocols do caching of documents. Occasionally, it is important to be able to "tunnel through" and ignore the caches (e.g., the "reload" button in a browser). If the UseCaches flag on a connection is true, the connection is allowed to use whatever caches it can. If false, caches are to be ignored. The default value comes from DefaultUseCaches, which defaults to true.

like image 3
Jim Garrison Avatar answered Nov 19 '22 07:11

Jim Garrison