Passing a collection argument without unpacking its contents

Tags: python, arguments, iterable-unpacking

Question: What are the pros and cons of writing an __init__ that takes a collection directly as an argument, rather than unpacking its contents?

Context: I'm writing a class to process data from several fields in a database table. I iterate through a large query result (~100 million rows), passing one row at a time to a class that performs the processing. Each row is retrieved from the database as a tuple (or, optionally, as a dictionary).

Discussion: Assume I'm interested in exactly three fields, but what gets passed into my class depends on the query, and the query is written by the user. The most basic approach might be one of the following:

class Direct:
    def __init__(self, names):
        self.names = names

class Simple:
    def __init__(self, names):
        self.name1 = names[0]
        self.name2 = names[1]
        self.name3 = names[2]

class Unpack:
    def __init__(self, names):
        self.name1, self.name2, self.name3 = names

Here are some examples of rows that might be passed to a new instance:

good = ('Simon', 'Marie', 'Kent')                 # Exactly what we want
bad1 = ('Simon', 'Marie', 'Kent', '10 Main St')   # Extra field(s) behind
bad2 = ('15', 'Simon', 'Marie', 'Kent')           # Extra field(s) in front
bad3 = ('Simon', 'Marie')                         # Forgot a field

When faced with the above, Direct always runs (at least to this point) but is very likely to be buggy (garbage in, garbage out). It takes one argument and assigns it exactly as given, so this could be a tuple or list of any size, None, a function reference, etc. This is the quickest and dirtiest way I can think of to initialize the object, but I feel like the class should complain immediately when I give it data it's clearly not designed to handle.

Simple handles bad1 correctly, silently stores the wrong values when given bad2 (name1 ends up as '15'), and raises IndexError when given bad3. It's convenient to be able to effectively truncate the inputs from bad1, but not worth the bugs that would come from bad2. This one feels naive and inconsistent.

Unpack seems like the safest approach, because it raises ValueError in all three "bad" cases. The last thing we want to do is silently fill our database with bad information, right? It takes the tuple directly, but lets me identify its contents as distinct attributes instead of forcing me to keep referring to indices, and it complains if the tuple is the wrong size.
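
The behaviour described above can be checked directly. The following is a minimal sketch that reuses the three classes and the sample rows from the question; the `outcome` helper and the `rows` and `results` dictionaries are hypothetical names introduced only for this illustration.

```python
class Direct:
    def __init__(self, names):
        self.names = names

class Simple:
    def __init__(self, names):
        self.name1 = names[0]
        self.name2 = names[1]
        self.name3 = names[2]

class Unpack:
    def __init__(self, names):
        self.name1, self.name2, self.name3 = names

# Sample rows from the question, keyed by their labels.
rows = {
    'good': ('Simon', 'Marie', 'Kent'),
    'bad1': ('Simon', 'Marie', 'Kent', '10 Main St'),
    'bad2': ('15', 'Simon', 'Marie', 'Kent'),
    'bad3': ('Simon', 'Marie'),
}

def outcome(cls, row):
    """Hypothetical helper: return 'ok' or the name of the exception raised."""
    try:
        cls(row)
    except (IndexError, ValueError) as exc:
        return type(exc).__name__
    return 'ok'

# Build a class-by-row table of outcomes.
results = {
    cls.__name__: {label: outcome(cls, row) for label, row in rows.items()}
    for cls in (Direct, Simple, Unpack)
}
for name, by_row in results.items():
    print(name, by_row)
```

Note that 'ok' only means that no exception was raised: Simple reports 'ok' for bad2 even though its name1 attribute would hold '15'.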

On the other hand, why pass a collection at all? Since I know I always want three fields, I can define __init__ to explicitly accept three arguments, and unpack the collection using the *-operator as I pass it to the new object:

class Explicit:
    def __init__(self, name1, name2, name3):
        self.name1 = name1
        self.name2 = name2
        self.name3 = name3

names = ('Guy', 'Rose', 'Deb')
e = Explicit(*names)

The only differences I see are that the __init__ definition is a bit more verbose and that a wrong-sized tuple raises TypeError instead of ValueError. Philosophically, it seems to make sense that if we are taking some group of data (a row of a query) and examining its parts (three fields), we should pass a group of data (the tuple) but store its parts (the three attributes). By that reasoning, Unpack would be better.
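
The difference in exception type can be seen with a short sketch. It reuses the Unpack and Explicit classes from the question; `error_name` and `too_short` are hypothetical names used only here.

```python
class Unpack:
    def __init__(self, names):
        self.name1, self.name2, self.name3 = names

class Explicit:
    def __init__(self, name1, name2, name3):
        self.name1 = name1
        self.name2 = name2
        self.name3 = name3

too_short = ('Simon', 'Marie')  # one field missing

def error_name(build):
    """Hypothetical helper: call build() and return the exception's class name, or None."""
    try:
        build()
    except Exception as exc:
        return type(exc).__name__
    return None

# Unpack fails inside __init__, during tuple unpacking.
unpack_error = error_name(lambda: Unpack(too_short))
# Explicit fails at the call itself, before __init__ runs.
explicit_error = error_name(lambda: Explicit(*too_short))

print(unpack_error, explicit_error)
```

With Explicit, the size check is performed by Python's argument binding, so the error is reported at the call site rather than inside the class.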

If I wanted to accept an indeterminate number of fields, rather than always three, I still have the choice to pass the tuple directly or use arbitrary argument lists (*args, **kwargs) and *-operator unpacking. So I'm left wondering, is this a completely neutral style decision?
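
For the indeterminate case, the two options might look like the sketch below. The class names `TakesTuple` and `TakesArgs` are hypothetical and not part of the original question.

```python
class TakesTuple:
    """Hypothetical: accepts the whole row as a single collection argument."""
    def __init__(self, names):
        self.names = tuple(names)

class TakesArgs:
    """Hypothetical: accepts any number of positional arguments."""
    def __init__(self, *names):
        self.names = names  # *names is already a tuple

row = ('Guy', 'Rose', 'Deb', 'Ann')

a = TakesTuple(row)   # pass the collection directly
b = TakesArgs(*row)   # unpack at the call site

# Forgetting the * stores the whole row as a single element.
c = TakesArgs(row)

print(a.names == b.names, len(c.names))
```

Both forms end up holding the same tuple. One practical difference is that omitting the * with the second form does not raise an error; it quietly wraps the row in a one-element tuple.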

asked Jun 20 '13 20:06

Air

1 Answer

This question is probably best answered by trying out the different approaches and seeing what makes the most sense to you and is the most easily understood by others reading your code.

Now that I have the benefit of more experience, I'd ask myself, how do I plan to access these values?

When I access any one of the values in this collection, am I likely to be using most or all of the values in that same subroutine or section of code? If so, the "Direct" approach is a good choice; it's the most compact and it lets me think about the collection as a collection until the point that I absolutely need to pay attention to what's inside.

On the other hand, if I'm using some values here and some values there, I don't want to have to constantly remember which index to access, or add verbosity in the form of dictionary keys, when I could just refer directly to the values using separately named attributes. I would probably avoid the "Direct" approach in this case so that I only have to think about the fact that there's a collection when the class is first initialized.

Each of the remaining approaches involves splitting the collection up into different attributes, and I think the clear winner here is the "Explicit" approach. The "Simple" and "Unpack" approaches share a hidden dependency on the order of the collection, without offering any real advantage.
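
One way to see the point about ordering: the question mentions that rows can optionally be retrieved as dictionaries. As a sketch, and assuming the dictionary keys happen to match the parameter names (name1, name2, name3 here, which is an assumption about the query's column names), an "Explicit"-style class can be filled by keyword with the ** operator, so the order of the fields no longer matters.

```python
class Explicit:
    def __init__(self, name1, name2, name3):
        self.name1 = name1
        self.name2 = name2
        self.name3 = name3

# Assumed: a row returned as a dict whose keys match the parameter names.
row = {'name3': 'Deb', 'name1': 'Guy', 'name2': 'Rose'}
e = Explicit(**row)  # matched by name, not by position
print(e.name1, e.name2, e.name3)

# An unexpected field is rejected at the call, as with a wrong-sized tuple.
extra = dict(row, address='10 Main St')
try:
    Explicit(**extra)
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```

Positional unpacking with `Explicit(*names)` still depends on the order of the tuple; only the keyword form removes that dependency.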

answered Oct 18 '22 13:10

Air
