I'm using the praw Reddit library to pull data from Reddit, and I ran into this bit of code inside the class BaseReddit (full source). I don't understand why it returns any data:
    def get_content(self, page_url, limit=0, url_data=None, place_holder=None,
                    root_field='data', thing_field='children',
                    after_field='after'):
        """A generator method to return reddit content from a URL. Starts at
        the initial page_url, and fetches content using the `after` JSON data
        until `limit` entries have been fetched, or the `place_holder` has been
        reached.

        :param page_url: the url to start fetching content from
        :param limit: the maximum number of content entries to fetch. If
            limit <= 0, fetch the default_content_limit for the site. If None,
            then fetch unlimited entries--this would be used in conjunction
            with the place_holder param.
        :param url_data: dictionary containing extra GET data to put in the url
        :param place_holder: if not None, the method will fetch `limit`
            content, stopping if it finds content with `id` equal to
            `place_holder`.
        :param data_field: indicates the field in the json response that holds
            the data. Most objects use 'data', however some (flairlist) don't
            have the 'data' object. Use None for the root object.
        :param thing_field: indicates the field under the data_field which
            contains the list of things. Most objects use 'children'.
        :param after_field: indicates the field which holds the after item
            element
        :type place_holder: a string corresponding to a reddit content id, e.g.
            't3_asdfasdf'
        :returns: a list of reddit content, of type Subreddit, Comment,
            Submission or user flair.
        """
        content_found = 0

        if url_data is None:
            url_data = {}
        if limit is None:
            fetch_all = True
        elif limit <= 0:
            fetch_all = False
            limit = int(self.config.default_content_limit)
        else:
            fetch_all = False

        # While we still need to fetch more content to reach our limit, do so.
        while fetch_all or content_found < limit:
            page_data = self.request_json(page_url, url_data=url_data)
            if root_field:
                root = page_data[root_field]
            else:
                root = page_data
            for thing in root[thing_field]:
                yield thing
                content_found += 1
                # Terminate when we reached the limit, or place holder
                if (content_found == limit or
                        place_holder and thing.id == place_holder):
                    return
            # Set/update the 'after' parameter for the next iteration
            if after_field in root and root[after_field]:
                url_data['after'] = root[after_field]
            else:
                return
It looks to me like all the return statements have no arguments and therefore would default to returning None. Can someone explain this to me?
Note: Code is Python 2.x
It is a generator. See the yield statement for a hint of that.
http://wiki.python.org/moin/Generators
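To make that hint concrete, here is a minimal sketch showing that calling a function containing yield does not run its body; it hands back a generator object that produces values on demand. The function name `countdown` is made up for illustration and is not part of praw. It is written so that it should run under both Python 2 and Python 3.

```python
import types


def countdown(n):
    # The presence of `yield` is what makes this a generator function.
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)

# Calling the function returned a generator object, not a number or None.
print(isinstance(gen, types.GeneratorType))

# Iterating over the generator is what actually runs the body.
print(list(gen))
```

The first print shows True and the second shows [3, 2, 1]: the values come from the yield statements, not from any return.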
This is a generator function, which you can tell by the yield statement. Each value is effectively 'returned' to the caller without the function actually returning, and a bare return just ends the generator rather than handing None back. When another value is requested, the generator resumes from the point where it yielded (in the code below, it continues the for thing loop):
    for thing in root[thing_field]:
        yield thing
Simple example:
    def blah():
        for i in xrange(5):
            yield i + 3

    numbers = blah()
    print next(numbers)  # prints 3
    # lots of other code here...
    # now we need the next value
    print next(numbers)  # prints 4
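As for the bare return statements in the question: inside a generator they only end the iteration, so the caller never sees None. The sketch below imitates the limit and place_holder checks from get_content in a stripped-down form. The name `take_until` and its parameters are hypothetical, not praw API, and the code is written so that it should run under both Python 2 and Python 3.

```python
def take_until(items, limit, stop_at=None):
    """Yield items until `limit` were produced or `stop_at` is seen."""
    found = 0
    for item in items:
        yield item
        found += 1
        # Same idea as the termination check in get_content.
        if found == limit or (stop_at is not None and item == stop_at):
            return  # ends the generator; no value is handed to the caller


# Stops after three items because the limit was reached.
print(list(take_until("abcdef", 3)))

# Stops early because the placeholder-like value "d" was seen.
print(list(take_until("abcdef", 10, stop_at="d")))

gen = take_until("ab", 1)
print(next(gen))
try:
    next(gen)  # the generator resumes, hits `return`, and finishes
except StopIteration:
    print("generator finished")
```

A for loop or list() handles the StopIteration internally, which is why code consuming get_content simply sees the sequence of yielded items come to an end.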