
Fixed strptime exception with thread lock, but slows down the program

I have the following code, which runs inside a thread (the full code is here: https://github.com/eWizardII/homobabel/blob/master/lovebird.py):

    for null in range(0,1):
        while True:
            try:
                with open('C:/Twitter/tweets/user_0_' + str(self.id) + '.json', mode='w') as f:
                    f.write('[')
                    threadLock.acquire()
                    for i, seed in enumerate(Cursor(api.user_timeline,screen_name=self.ip).items(200)):
                        if i>0:
                            f.write(", ")
                        f.write("%s" % (json.dumps(dict(sc=seed.author.statuses_count))))
                        j = j + 1
                    threadLock.release()
                    f.write("]")
            except tweepy.TweepError, e:
                with open('C:/Twitter/tweets/user_0_' + str(self.id) + '.json', mode='a') as f:
                    f.write("]")
                print "ERROR on " + str(self.ip) + " Reason: ", e
                with open('C:/Twitter/errors_0.txt', mode='a') as a_file:
                    new_ii = "ERROR on " + str(self.ip) + " Reason: " + str(e) + "\n"
                    a_file.write(new_ii)
            break

Now without the thread lock I generate the following error:

Exception in thread Thread-117:
Traceback (most recent call last):
  File "C:\Python27\lib\threading.py", line 530, in __bootstrap_inner
    self.run()
  File "C:/Twitter/homobabel/lovebird.py", line 62, in run
    for i, seed in enumerate(Cursor(api.user_timeline,screen_name=self.ip).items(200)):
  File "build\bdist.win-amd64\egg\tweepy\cursor.py", line 110, in next
    self.current_page = self.page_iterator.next()
  File "build\bdist.win-amd64\egg\tweepy\cursor.py", line 85, in next
    items = self.method(page=self.current_page, *self.args, **self.kargs)
  File "build\bdist.win-amd64\egg\tweepy\binder.py", line 196, in _call
    return method.execute()
  File "build\bdist.win-amd64\egg\tweepy\binder.py", line 182, in execute
    result = self.api.parser.parse(self, resp.read())
  File "build\bdist.win-amd64\egg\tweepy\parsers.py", line 75, in parse
    result = model.parse_list(method.api, json)
  File "build\bdist.win-amd64\egg\tweepy\models.py", line 38, in parse_list
    results.append(cls.parse(api, obj))
  File "build\bdist.win-amd64\egg\tweepy\models.py", line 49, in parse
    user = User.parse(api, v)
  File "build\bdist.win-amd64\egg\tweepy\models.py", line 86, in parse
    setattr(user, k, parse_datetime(v))
  File "build\bdist.win-amd64\egg\tweepy\utils.py", line 17, in parse_datetime
    date = datetime(*(time.strptime(string, '%a %b %d %H:%M:%S +0000 %Y')[0:6]))
  File "C:\Python27\lib\_strptime.py", line 454, in _strptime_time
    return _strptime(data_string, format)[0]
  File "C:\Python27\lib\_strptime.py", line 300, in _strptime
    _TimeRE_cache = TimeRE()
  File "C:\Python27\lib\_strptime.py", line 188, in __init__
    self.locale_time = LocaleTime()
  File "C:\Python27\lib\_strptime.py", line 77, in __init__
    raise ValueError("locale changed during initialization")
ValueError: locale changed during initialization

The problem is that with the thread lock on, the threads effectively run serially, and each loop takes far too long for threading to offer any advantage. If there isn't a way to get rid of the thread lock, is there a way to make the for loop inside the try statement run faster?

asked Feb 25 '23 10:02 by eWizardII


1 Answer

According to a previous answer on Stack Overflow, time.strptime is not thread-safe. Unfortunately, the error referenced in that question is different from the one you're experiencing.

Their solution was to call time.strptime prior to initializing any threads, and then subsequent calls to time.strptime in various threads will work.

I think the same solution may work in your situation after reviewing the _strptime and locale standard library modules. I can't be certain it will work since I can't test your code locally, but I thought I'd provide you with a potential solution.
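
A minimal sketch of that idea, written in Python 3 syntax (the same pattern applies to Python 2.7). The names TWITTER_FORMAT, parse_datetime, and run_workers are hypothetical and only illustrate the pattern; the format string is the one shown in the traceback. The warm-up call in the main thread lets the strptime machinery initialize once before any worker thread touches it. Whether this is enough to avoid the error in your program is an assumption I haven't been able to test:

```python
import threading
import time
from datetime import datetime

# Format string copied from the traceback in the question.
TWITTER_FORMAT = '%a %b %d %H:%M:%S +0000 %Y'


def parse_datetime(string):
    # Same parsing expression that appears in the traceback.
    return datetime(*(time.strptime(string, TWITTER_FORMAT)[0:6]))


def run_workers(samples):
    # Warm-up call in the main thread, before any worker thread exists.
    time.strptime('Thu Jan 01 00:00:00 +0000 1970', TWITTER_FORMAT)

    results = [None] * len(samples)

    def worker(index, text):
        # Each thread parses without holding a lock.
        results[index] = parse_datetime(text)

    threads = [threading.Thread(target=worker, args=(i, s))
               for i, s in enumerate(samples)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In your script, the warm-up call would go near the top of the main module, before the first thread is started.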

Let me know if this works.

Edit:

I've done a bit more research, and the Python standard library calls setlocale, which is declared in the locale.h C header file. According to the setlocale documentation, this function is not thread-safe, and calls to setlocale should occur before initializing threads, as I mentioned previously.

Unfortunately, setlocale is called each time you call time.strptime. So, I suggest the following:

  1. Test the solution laid out earlier: call time.strptime before initializing the threads, and remove the locks.
  2. If #1 doesn't work, you'll probably need to write your own thread-safe replacement for time.strptime, as mentioned in the Python documentation for the locale module.
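
For option 2, one possible shape is a parser that never touches time.strptime or the locale at all. This is only a sketch: parse_twitter_datetime and _MONTHS are hypothetical names, and it assumes every timestamp follows the fixed '%a %b %d %H:%M:%S +0000 %Y' layout from the traceback, with English month abbreviations. How to wire it into tweepy is not shown, since that depends on the installed version:

```python
from datetime import datetime

# English month abbreviations, hard-coded so no locale lookup is needed.
_MONTHS = {name: number for number, name in enumerate(
    ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
     'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], start=1)}


def parse_twitter_datetime(string):
    """Parse e.g. 'Wed Aug 27 13:08:45 +0000 2008' without time.strptime."""
    parts = string.split()
    # Expected fields: weekday, month, day, HH:MM:SS, '+0000', year.
    if len(parts) != 6 or parts[4] != '+0000':
        raise ValueError('unexpected timestamp format: %r' % (string,))
    _weekday, month_name, day, clock, _offset, year = parts
    if month_name not in _MONTHS:
        raise ValueError('unknown month abbreviation: %r' % (month_name,))
    hour, minute, second = (int(x) for x in clock.split(':'))
    return datetime(int(year), _MONTHS[month_name], int(day),
                    hour, minute, second)
```

Because the function uses only local variables and a read-only dictionary, threads can call it concurrently without a lock.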

answered Feb 28 '23 13:02 by brildum