Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does app engine (python) manage memory across requests (Exceeded soft private memory limit)

I'm experiencing occasional Exceeded soft private memory limit error in a wide variety of request handlers in app engine. I understand that this error means that the RAM used by the instance has exceeded the amount allocated, and how that causes the instance to shut down.

I'd like to understand what might be the possible causes of the error, and to start, I'd like to understand how app engine python instances are expected to manage memory. My rudimentary assumptions were:

  1. An F2 instance starts with 256 MB
  2. When it starts up, it loads my application code - lets say 30 MB
  3. When it handles a request it has 226 MB available
    • so long as that request does not exceed 226 MB (+ margin of error) the request completes w/o error
    • if it does exceed 226 MB + margin, the instance completes the request, logs the 'Exceeded soft private memory limit' error, then terminates - now go back to step 1
  4. When that request returns, any memory used by it is freed up - ie. the unused RAM goes back to 226 MB
  5. Step 3-4 are repeated for each request passed to the instance, indefinitely

That's how I presumed it would work, but given that I'm occasionally seeing this error across a fairly wide set of request handlers, I'm now not so sure. My questions are:

a) Does step #4 happen?

b) What could cause it not to happen? or not to fully happen? e.g. how could memory leak between requests?

c) Could storage in module level variables causes memory usage to leak? (I'm not knowingly using module level variables in that way)

d) What tools / techniques can I use to get more data? E.g. measure memory usage at entry to request handler?

In answers/comments, where possible, please link to the gae documentation.

[edit] Extra info: my app is congifured as threadsafe: false. If this has a bearing on the answer, please state what it is. I plan to change to threadsafe: true soon.

[edit] Clarification: This question is about the expected behavior of gae for memory management. So while suggestions like 'call gc.collect()' might well be partial solutions to related problems, they don't fully answer this question. Up until the point that I understand how gae is expected to behave, using gc.collect() would feel like voodoo programming to me.

Finally: If I've got this all backwards then I apologize in advance - I really cant find much useful info on this, so I'm mostly guessing..

like image 678
tom Avatar asked Oct 06 '15 22:10

tom


1 Answers

App Engine's Python interpreter does nothing special, in terms of memory management, compared to any other standard Python interpreter. So, in particular, there is nothing special that happens "per request", such as your hypothetical step 4. Rather, as soon as any object's reference count decreases to zero, the Python interpreter reclaims that memory (module gc is only there to deal with garbage cycles -- when a bunch of objects never get their reference counts down to zero because they refer to each other even though there is no accessible external reference to them).

So, memory could easily "leak" (in practice, though technically it's not a leak) "between requests" if you use any global variable -- said variables will survive the instance of the handler class and its (e.g) get method -- i.e, your point (c), though you say you are not doing that.

Once you declare your module to be threadsafe, an instance may happen to serve multiple requests concurrently (up to what you've set as max_concurrent_requests in the automatic_scaling section of your module's .yaml configuration file; the default value is 8). So, your instance's RAM will need be a multiple of what each request needs.

As for (d), to "get more data" (I imagine you actually mean, get more RAM), the only thing you can do is configure a larger instance_class for your memory-hungry module.

To use less RAM, there are many techniques -- which have nothing to do with App Engine, everything to do with Python, and in particular, everything to do with your very specific code and its very specific needs.

The one GAE-specific issue I can think of is that ndb's caching has been reported to leak -- see https://code.google.com/p/googleappengine/issues/detail?id=9610 ; that thread also suggests workarounds, such as turning off ndb caching or moving to old db (which does no caching and has no leak). If you're using ndb and have not turned off its caching, that might be the root cause of "memory leak" problems you're observing.

like image 64
Alex Martelli Avatar answered Sep 28 '22 04:09

Alex Martelli