
Global variable not recognized in functions

I know that this question looks exactly like so many others on here. I have just read them all, and they all say to do what I already tried, which hasn't worked (or I'm missing a subtle difference in my situation). Here is my situation:

I am writing a scraper using Scrapy and Python 2.7.11, and my code looks like this (this is a copy and paste with irrelevant lines omitted, but I can re-add them upon request):

class LbcSubtopicSpider(scrapy.Spider):

    ...omitted...

    rawTranscripts = []
    rawTranslations = []

    def parse(self, response):
        #global rawTranscripts, rawTranslations
        rawTitles = []
        rawVideos = []
        for sel in response.xpath('//ul[1]'): #only scrape the first list

        ...omitted...

            index = 0
            for sub in sel.xpath('li/ul/li/a'): #scrape the sublist items
                index += 1
                if index%2!=0: #odd numbered entries are the transcripts
                    transcriptLink = sub.xpath('@href').extract()
                    #url = response.urljoin(transcriptLink[0])
                    #yield scrapy.Request(url, callback=self.parse_transcript)
                else: #even numbered entries are the translations
                    translationLink = sub.xpath('@href').extract()
                    url = response.urljoin(translationLink[0])
                    yield scrapy.Request(url, callback=self.parse_translation)

        print rawTitles
        print rawVideos
        print rawTranslations

    def parse_translation(self, response):
        global rawTranslations
        for sel in response.xpath('//p[not(@class)]'):
            rawTranslation = sel.xpath('text()').extract()
            rawTranslations.append(rawTranslation)

This raises an error whenever either "print rawTranslations" or "rawTranslations.append(rawTranslation)" is executed, because the global name "rawTranslations" is not defined.

As I said before, I have looked into this pretty extensively and pretty much everyone on the internet says to just add a "global (name)" line to the beginning of any function you'd use/modify it in (although I'm not assigning to it ever, so I shouldn't even need this). I get the same result whether or not my global lines are commented out. This behavior seems to defy everything I've read about how globals work in Python, so I suspect this might be a Scrapy quirk related to how parse functions are called through scrapy.Request(....).

Apologies for posting what appears to be the same question you've seen so much yet again, but it seems to be a bit twisted this time around and hopefully someone can get to the bottom of it. Thanks.

asked Jan 05 '23 18:01 by jah


1 Answer

In your case the variable you want to access is not global; it is a class attribute. Inside a method you reach it through self (for example self.rawTranslations), not as a bare name.

global_var = "global"

class Example:

    class_var = "class"

    def __init__(self):
        self.instance_var = "instance"

    def check(self):
        print(instance_var) # error (NameError)
        print(self.instance_var) # works
        print(class_var) # error (NameError)
        print(self.class_var) # works, lookup goes "up" to the class
        print(global_var) # works
        print(self.global_var) # error (AttributeError)
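
Applied to the situation in the question, the fix might look roughly like the sketch below. It deliberately leaves Scrapy out so that it runs on its own: SubtopicCollector, raw_translations and report are hypothetical stand-ins for the spider class, its rawTranslations list and its printing code, not part of any real API.

```python
class SubtopicCollector(object):
    """Hypothetical stand-in for the spider in the question (no Scrapy needed)."""

    # Class attribute: visible as self.raw_translations, never as a bare name.
    raw_translations = []

    def parse_translation(self, paragraphs):
        # "paragraphs" stands in for the text extracted from a response.
        for text in paragraphs:
            # Attribute lookup goes from the instance "up" to the class.
            self.raw_translations.append(text)

    def report(self):
        # Return a copy so callers cannot accidentally modify the shared list.
        return list(self.raw_translations)


collector = SubtopicCollector()
collector.parse_translation(["first", "second"])
print(collector.report())
```

Because the list is defined on the class, all instances share it. If each instance should have its own list, create it in __init__ as self.raw_translations = [] instead.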

You only need the global keyword if you want to assign to a global variable inside a function. Hint: Don't do that, because global variables that are written to bring nothing but pain and despair. Only use global variables as (config) constants.

global_var = "global"

class Example:

    def ex1(self):
        global_var = "local" # creates a new local variable named "global_var"

    def ex2(self):
        global global_var
        global_var = "local" # changes the global variable

Example().ex1()
print(global_var) # will still be "global"
Example().ex2()
print(global_var) # will now be "local"
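
The question also mentions only appending to the list, never assigning to it. As a small sketch of that distinction (the names items, add_item, reset_wrong and reset_right are made up for illustration): mutating a module-level object works without the global keyword, and only rebinding the name requires it.

```python
items = []  # module-level (global) list


def add_item(value):
    # No assignment to "items" here, so the name resolves to the global list.
    items.append(value)


def reset_wrong():
    # Assignment creates a new local variable; the global list is untouched.
    items = []


def reset_right():
    global items
    items = []  # rebinds the global name to a new, empty list


add_item("a")
reset_wrong()
after_wrong = list(items)   # snapshot after the ineffective reset
reset_right()
after_right = list(items)   # snapshot after the real reset
print(after_wrong, after_right)
```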

answered Jan 18 '23 18:01 by syntonym