I have written a few Python scripts for the Assessor's office where I work. Most of them ask for a parcel ID number as input, which is then used to fetch certain data over an ODBC connection. The users are not very consistent about how they enter parcel IDs.
So here is my problem: they enter a parcel ID in one of 3 ways:
1: '1005191000060'
2: '001005191000060'
3: '0010-05-19-100-006-0'
The third way is the correct one, so I need to make sure the input is always converted to that format. Of course, they would rather type in the ID one of the first two ways. Parcel numbers must always be 15 digits long (20 characters with dashes).
I currently have a working method for fixing the parcel ID, but it is very ugly. I am wondering if anyone knows a better (or more "Pythonic") way. I have a function that usually gets imported into all these scripts. Here is what I have:
import re
def FormatPID(in_pid):
    pid_format = re.compile(r'\d{4}-\d{2}-\d{2}-\d{3}-\d{3}-\d{1}')
    pid = in_pid.zfill(15) 
    if not pid_format.match(pid):
        fixed_pid = '-'.join([pid[:4],pid[4:6],pid[6:8],pid[8:11],pid[11:-1],pid[-1]])
        return fixed_pid
    else:
        return pid
if __name__ == '__main__':
    pid = '1005191000060'
##    pid = '001005191000060'
##    pid = '0010-05-19-100-006-0'
    # test
    t = FormatPID(pid)
    print(t)
This works just fine, but this ugly code has bothered me for a while, and I think there has got to be a better way than slicing. I am hoping there is a way to "force" the input string to match the "pid_format" pattern. Any ideas? I couldn't find anything that does this in the regular expressions module.
I wouldn't bother using regexes. You just want to get all the digits, ignoring hyphens, left-pad with 0s, then insert the hyphens in the right places, right? So:
def format_pid(pid):
    p = pid.replace('-', '')
    if not p.isdigit():
        raise ValueError('Invalid format: {}'.format(pid))
    p = p.zfill(15)
    # You can use your `join` call instead of the following if you prefer.
    # Or Ashwini's islice call.
    return '{}-{}-{}-{}-{}-{}'.format(p[:4], p[4:6], p[6:8], p[8:11], p[11:14], p[14:])
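If you would rather not spell out every slice by hand, here is a minimal sketch of the same idea driven by a tuple of group widths, using itertools.islice. The names PID_WIDTHS and format_pid_by_widths are made up for illustration, and the length check is an extra assumption on my part (the function above does not reject IDs longer than 15 digits).

```python
from itertools import islice

# Group widths for the 0010-05-19-100-006-0 layout (4+2+2+3+3+1 = 15 digits).
# PID_WIDTHS is a hypothetical name, not something from the original scripts.
PID_WIDTHS = (4, 2, 2, 3, 3, 1)

def format_pid_by_widths(pid, widths=PID_WIDTHS):
    """Strip hyphens, left-pad with zeros, then re-insert hyphens by width."""
    digits = pid.replace('-', '')
    total = sum(widths)
    # Assumption: anything non-numeric or longer than 15 digits is an error.
    if not digits.isdigit() or len(digits) > total:
        raise ValueError('Invalid parcel ID: {!r}'.format(pid))
    it = iter(digits.zfill(total))
    # Each islice call takes the next `w` characters from the shared iterator.
    return '-'.join(''.join(islice(it, w)) for w in widths)

if __name__ == '__main__':
    for raw in ('1005191000060', '001005191000060', '0010-05-19-100-006-0'):
        print(format_pid_by_widths(raw))
```

All three sample inputs from the question should come out in the dashed form. The upside of the widths tuple is that the layout lives in one place if the format ever changes.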