I'm working with data from two different web pages about the same individual; the data is legal information on that person. Some of the data is available on the first page, so I initialize a Defendant object with the proper info and set the attributes that I don't yet have data for to null. This is the class:
class Defendant(object):
    """holds data for each individual defendant"""
    def __init__(self, full_name, first_name, last_name, type_of_appeal, county, case_number, date_of_filing,
                 race, sex, dc_number, hair_color, eye_color, height, weight, birth_date,
                 initial_receipt_date, current_facility, current_custody, current_release_date, link_to_page):
        self.full_name = full_name
        self.first_name = first_name
        self.last_name = last_name
        self.type_of_appeal = type_of_appeal
        self.county = county
        self.case_number = case_number
        self.date_of_filing = date_of_filing
        self.race = 'null'
        self.sex = 'null'
        self.dc_number = 'null'
        self.hair_color = 'null'
        self.eye_color = 'null'
        self.height = 'null'
        self.weight = 'null'
        self.birth_date = 'null'
        self.initial_receipt_date = 'null'
        self.current_facility = 'null'
        self.current_custody = 'null'
        self.current_release_date = 'null'
        self.link_to_page = link_to_page
And this is what it looks like when I add a half-filled out Defendant object to a list of defendants:
list_of_defendants.append(Defendant(name_final,'null','null',type_of_appeal_final,county_parsed_final,case_number,date_of_filing,'null','null','null','null','null','null','null','null','null','null','null','null',link_to_page))
Then, when I get the rest of the data from the other page, I update the attributes that were set to null like so:
for defendant in list_of_defendants:
    defendant.sex = location_of_sex_on_page
    defendant.first_name = location_of_first_name_on_page
    ## Etc.
My question is: is there a more pythonic way to either add attributes to a class or a less ugly way of initializing the class object when I only have half of the information that I want to store in it?
First, use default values for any arguments that you're setting to null. This way you don't even need to specify these arguments when instantiating the object (and you can specify any you do need, in any order, by using the argument name). You should use the Python value None rather than the string "null" for these, unless there is some specific reason for using the string. Parameters with default values must come after those without, so link_to_page needs to be moved before these.
Then, you can set your attributes by updating the instance's __dict__ attribute, which stores the attributes attached to the instance. Each argument will be set as an attribute of the instance having the same name.
    def __init__(self, full_name, first_name, last_name, type_of_appeal, county, case_number,
                 date_of_filing, link_to_page, race=None, sex=None, dc_number=None,
                 hair_color=None, eye_color=None, height=None, weight=None, birth_date=None,
                 initial_receipt_date=None, current_facility=None, current_custody=None,
                 current_release_date=None):
        # set all arguments as attributes of this instance
        # (__code__ works on Python 2.6+ and 3; func_code exists only on Python 2)
        code = self.__init__.__func__.__code__
        argnames = code.co_varnames[1:code.co_argcount]
        locs = locals()
        self.__dict__.update((name, locs[name]) for name in argnames)
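As a rough, self-contained sketch of the same idea, here is a trimmed-down version of the class that can be run as-is. The shortened parameter list and the sample values (name, case number, URL) are made up for illustration and are not from the original data.

```python
class Defendant(object):
    """Holds data for one defendant (parameter list shortened for illustration)."""

    def __init__(self, full_name, case_number, link_to_page,
                 race=None, sex=None, eye_color=None):
        # Names of the parameters after `self`, read from the code object.
        code = self.__init__.__func__.__code__
        argnames = code.co_varnames[1:code.co_argcount]
        locs = locals()
        # Store every argument as an instance attribute of the same name.
        self.__dict__.update((name, locs[name]) for name in argnames)


# Hypothetical sample values; only `sex` is known among the optional fields.
d = Defendant("Jane Doe", "XX-0000", "http://example.com/case", sex="Female")
print(d.full_name, d.sex, d.race)
```

Only the required arguments and the one known optional field are passed; everything else falls back to None.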
You might also consider synthesizing the full_name from the two other name arguments. Then you don't have to pass in redundant information, and the names can never disagree. You can do this on the fly via a property:
    @property
    def full_name(self):
        return self.first_name + " " + self.last_name
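A minimal sketch of how such a property behaves, assuming full_name is dropped from the __init__ parameters and only the two name parts are stored. The two-field class and the sample names are hypothetical.

```python
class Defendant(object):
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    @property
    def full_name(self):
        # Computed on each access, so it always reflects the current names.
        return self.first_name + " " + self.last_name


d = Defendant("Jane", "Doe")
print(d.full_name)   # Jane Doe
d.last_name = "Roe"
print(d.full_name)   # Jane Roe
```

Note that this assumes both name parts are known: if either is still None, the string concatenation raises a TypeError, so the property may need a guard if names arrive later.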
For updating, I'd add a method to do that, accepting arbitrary keyword arguments using **kwargs. To help protect the integrity of the data, we will change only attributes that already exist and are set to None.
    def update(self, **kwargs):
        self.__dict__.update((k, kwargs[k]) for k in kwargs
                             if self.__dict__.get(k, False) is None)
Then you can easily update all the ones you want with a single call:
defendant.update(eye_color="Brown", hair_color="Black", sex="Male")
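To illustrate the guard in update, here is a small runnable sketch, assuming a cut-down class with three fields. The sample values and the shoe_size name are invented purely to show the behaviour.

```python
class Defendant(object):
    def __init__(self, full_name, sex=None, eye_color=None):
        self.full_name = full_name
        self.sex = sex
        self.eye_color = eye_color

    def update(self, **kwargs):
        # Fill in only attributes that already exist and are still None.
        self.__dict__.update((k, v) for k, v in kwargs.items()
                             if self.__dict__.get(k, False) is None)


d = Defendant("Jane Doe", sex="Female")
d.update(sex="Male", eye_color="Brown", shoe_size=9)
print(d.sex)                    # Female: already set, so left untouched
print(d.eye_color)              # Brown: was None, so filled in
print(hasattr(d, "shoe_size"))  # False: unknown names are ignored
```

The False default passed to get is what makes unknown names fail the `is None` test, so a typo in a keyword is silently skipped rather than creating a new attribute.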
To make sure an instance has been completely filled out, you can add a method or property that checks that no attribute is still None:
    @property
    def valid(self):
        return all(self.__dict__[k] is not None for k in self.__dict__)
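A short sketch of the check in use, again on a hypothetical three-field version of the class. The missing method is not part of the answer above; it is an assumed helper added here to show which fields are still unfilled.

```python
class Defendant(object):
    def __init__(self, full_name, sex=None, eye_color=None):
        self.full_name = full_name
        self.sex = sex
        self.eye_color = eye_color

    @property
    def valid(self):
        # True only when no instance attribute is still None.
        return all(value is not None for value in self.__dict__.values())

    def missing(self):
        # Hypothetical helper: names of attributes that are still None.
        return sorted(k for k, v in self.__dict__.items() if v is None)


d = Defendant("Jane Doe", sex="Female")
print(d.valid)      # False
print(d.missing())  # ['eye_color']
d.eye_color = "Brown"
print(d.valid)      # True
```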