
AppEngine dev_appserver - urllib2.urlopen issue with localhost url

UPDATE

App Engine SDK 1.9.24 was released on July 20, 2015, so if you're still experiencing this, you should be able to fix this simply by updating. See +jpatokal's answer below for an explanation of the exact problem and solution.

Original Question

I'm running into trouble with an application when developing locally.

We have some shared code that checks an auth server for our apps using urllib2.urlopen. When I develop locally, the request is rejected with a 404 when my app makes it from AppEngine, but the same request succeeds just fine from a terminal.

I have AppEngine running on localhost:8000 and the auth server on localhost:8001.

import urllib2

url = "http://localhost:8001/api/CheckAuthentication/?__client_id=dev&token=c7jl2y3smhzzqabhxnzrlyq5r5sdyjr8&username=amadison&__signature=6IXnj08bAnKoIBvJQUuBG8O1kBuBCWS8655s3DpBQIE="

try:
  r = urllib2.urlopen(url)
  print(r.geturl())
  print(r.read())
except urllib2.HTTPError as e:
  print("got error: {} - {}".format(e.code, e.reason))

When run from within AppEngine, this prints "got error: 404 - Not Found".

It appears that AppEngine is adding the scheme, host and port to the path portion of the URL I'm trying to hit, as this is what I see on the auth server:

[02/Jul/2015 16:54:16] "GET http://localhost:8001/api/CheckAuthentication/?__client_id=dev&token=c7jl2y3smhzzqabhxnzrlyq5r5sdyjr8&username=amadison&__signature=6IXnj08bAnKoIBvJQUuBG8O1kBuBCWS8655s3DpBQIE= HTTP/1.1" 404 10146

On the auth server, the request shows the scheme, host and port being passed along as part of the path (relevant pieces below):

 'HTTP_HOST': 'localhost:8001',
 'PATH_INFO': u'http://localhost:8001/api/CheckAuthentication/',
 'SERVER_PORT': '8001',
 'SERVER_PROTOCOL': 'HTTP/1.1',

Is there any way to stop the AppEngine dev server from hijacking this request to localhost on a different port? Or am I misunderstanding what is happening? Everything works fine in production, where our domains are different.

Thanks in advance for any assistance helping to point me in the right direction.

Aaron asked Jul 02 '15 22:07


2 Answers

This is an annoying problem introduced by the urlfetch_stub implementation. I'm not sure which gcloud SDK version introduced it.

I've fixed this by patching the gcloud SDK until Google does, which means this answer will hopefully be irrelevant shortly.

  1. Find and open urlfetch_stub.py, which can often be found at ~/google-cloud-sdk/platform/google_appengine/google/appengine/api/urlfetch_stub.py

  2. Around line 380 (depends on version), find:

full_path = urlparse.urlunsplit((protocol, host, path, query, ''))

and replace it with:

full_path = urlparse.urlunsplit(('', '', path, query, ''))

more info

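You were correct in assuming that the issue was a broken PATH_INFO value. By the time full_path is used, the connection to the host has already been made, so full_path is sent as the request target and only needs to contain the path and query.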
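To see what the patch changes, here is a minimal sketch that calls urlunsplit both ways. It assumes Python 3, where the function lives in urllib.parse rather than urlparse, and the variable values are illustrative ones modeled on the request in the question, not taken from the SDK source.

```python
from urllib.parse import urlunsplit  # Python 2 spelling: urlparse.urlunsplit

# Illustrative values modeled on the request in the question (assumptions).
protocol, host = "http", "localhost:8001"
path, query = "/api/CheckAuthentication/", "__client_id=dev"

# Original line: builds an absolute URL, which ends up on the request line.
absolute_target = urlunsplit((protocol, host, path, query, ""))

# Patched line: an empty scheme and host leave only the path and query.
relative_target = urlunsplit(("", "", path, query, ""))

print(absolute_target)
print(relative_target)
```

The first value is the full URL that the auth server logged as the path. The second is the plain path and query that the server expects.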

disclaimer

I may very easily have broken proxy requests with this patch. Because I expect Google to fix it, I'm not going to go too crazy about it.

To be very clear, this bug is ONLY related to LOCAL app development. You won't see it in production.

Josh answered Oct 31 '22 05:10


App Engine SDK 1.9.24 was released on July 20, 2015, so if you're still experiencing this, you should be able to fix this simply by updating.

Here's a brief explanation of what happened. Until 1.9.21, the SDK was formatting URL fetch requests with relative paths, like this:

GET /test/ HTTP/1.1
Host: 127.0.0.1:5000

In 1.9.22, to better support proxies, this changed to absolute paths:

GET http://127.0.0.1:5000/test/ HTTP/1.1
Host: 127.0.0.1:5000
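As a rough illustration of the difference, the sketch below builds both request heads from the same URL. build_request_head is a hypothetical helper written for this answer, not SDK code, and it assumes Python 3's urllib.parse.

```python
from urllib.parse import urlsplit


def build_request_head(url, absolute_form=False):
    """Return the request line and Host header for a GET of `url`.

    Hypothetical helper for illustration only; not part of the SDK.
    """
    parts = urlsplit(url)
    # Relative form: just the path (and query, if any).
    relative_target = parts.path or "/"
    if parts.query:
        relative_target += "?" + parts.query
    # Absolute form: the whole URL goes on the request line.
    target = url if absolute_form else relative_target
    return "GET {} HTTP/1.1\r\nHost: {}\r\n\r\n".format(target, parts.netloc)


print(build_request_head("http://127.0.0.1:5000/test/"))
print(build_request_head("http://127.0.0.1:5000/test/", absolute_form=True))
```

The Host header is identical in both cases. Only the request line differs.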

Both formats are perfectly legal per the HTTP/1.1 spec (see RFC 2616, section 5.1.2). However, although that spec dates to 1999, there are apparently quite a few HTTP request handlers that do not parse the absolute form correctly and instead just naively concatenate the path and the host together.

So in the interest of compatibility, the previous behavior has been restored. (Unless you're using a proxy, in which case the RFC requires absolute paths.)
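If you control the request handler on the receiving end, accepting both forms is not much work. The following is a minimal sketch, assuming Python 3 and a hypothetical helper name (path_and_query). It is not how any particular server framework implements this.

```python
from urllib.parse import urlsplit


def path_and_query(request_target):
    """Extract (path, query) from a relative or absolute request target.

    Hypothetical helper for illustration only.
    """
    if request_target.startswith("/"):
        # Relative form, e.g. "/test/?a=1": split off the query by hand so
        # a path that begins with "//" is not mistaken for a host.
        path, _, query = request_target.partition("?")
        return path, query
    # Absolute form, e.g. "http://127.0.0.1:5000/test/": drop scheme and host.
    parts = urlsplit(request_target)
    return parts.path or "/", parts.query


print(path_and_query("/test/"))
print(path_and_query("http://127.0.0.1:5000/test/"))
```

Both calls yield the same path, so routing behaves identically whichever form the client sends.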

lambshaanxy answered Oct 31 '22 06:10