How to access a field of a namedtuple using a variable for the field name?

Tags:

namedtuple

I can access elements of a named tuple by name as follows(*):

from collections import namedtuple
Car = namedtuple('Car', 'color mileage')
my_car = Car('red', 100)
print my_car.color

But how can I use a variable to specify the name of the field I want to access? E.g.

field = 'color'
my_car[field] # doesn't work
my_car.field # doesn't work

My actual use case is that I'm iterating through a pandas dataframe with for row in data.itertuples(). I am doing an operation on the value from a particular column, and I want to be able to specify the column to use by name as a parameter to the method containing this loop.

(*) example taken from here. I am using Python 2.7.

273

asked Jun 19 '17 15:06

LangeHaare

2 Answers

You can use the built-in getattr, which looks up an attribute by a name given as a string:

getattr(my_car, field)
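To tie this back to the question, here is a minimal sketch that reuses the Car example. The `total` helper and the missing `owner` field are hypothetical names added only for illustration. The syntax is Python 3, but getattr behaves the same way on Python 2.7.

```python
from collections import namedtuple

Car = namedtuple('Car', 'color mileage')
my_car = Car('red', 100)

# The field name is held in a variable and resolved at runtime.
field = 'color'
value = getattr(my_car, field)  # equivalent to my_car.color

# An optional third argument is returned instead of raising AttributeError
# when the field does not exist ('owner' is a made-up field name).
missing = getattr(my_car, 'owner', None)


def total(rows, column):
    """Sum the named field over an iterable of namedtuple rows (hypothetical helper)."""
    return sum(getattr(row, column) for row in rows)


cars = [Car('red', 100), Car('blue', 250)]
total_mileage = total(cars, 'mileage')
```

The same pattern should carry over to a loop over data.itertuples(), since each row there is also a namedtuple: pass the column name into the method as a string and call getattr(row, column) inside the loop.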

132

answered Oct 13 '22 05:10

juanpa.arrivillaga

The 'getattr' answer works, but there is another option which is slightly faster.

idx = {name: i for i, name in enumerate(list(df), start=1)}
for row in df.itertuples(name=None):
    example_value = row[idx['product_price']]

Explanation

Build a dictionary that maps each column name to its position in the row tuple. Call itertuples with name=None so that it returns plain tuples instead of namedtuples. Then read each value from the tuple using the position stored in the dictionary for that column name.

Make a dictionary to find the indexes.

idx = {name: i for i, name in enumerate(list(df), start=1)}

Use the dictionary to access the desired values by name in the row tuples

for row in df.itertuples(name=None):
    example_value = row[idx['product_price']]

Note: Use start=0 in enumerate if you call itertuples with index=False, because the index is then no longer the first element of each tuple.
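The offset in the note above can be checked on a tiny frame. This is only a sketch: the DataFrame contents and the `quantity` column are made up for illustration, and the variable names are not from the original answer.

```python
import pandas as pd

# Small illustrative frame; the values and the 'quantity' column are assumptions.
df = pd.DataFrame({"product_price": [10.0, 20.0], "quantity": [3, 4]})

# Default itertuples(): element 0 of each tuple is the index,
# so the columns start at position 1.
idx_with_index = {name: i for i, name in enumerate(df.columns, start=1)}
with_index = [
    row[idx_with_index["product_price"]]
    for row in df.itertuples(name=None)
]

# index=False: the index is left out, so the columns start at position 0.
idx_no_index = {name: i for i, name in enumerate(df.columns)}
no_index = [
    row[idx_no_index["product_price"]]
    for row in df.itertuples(index=False, name=None)
]
```

Both lists should hold the same prices. Using the wrong start value would shift every lookup by one column.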

Here is a working example showing both methods and the timing of both methods.

import numpy as np
import pandas as pd
import timeit

# Build a DataFrame of random test data
data_length = 3 * 10**5
fake_data = {
    "id_code": list(range(data_length)),
    "letter_code": np.random.choice(list('abcdefgz'), size=data_length),
    "pine_cones": np.random.randint(low=1, high=100, size=data_length),
    "area": np.random.randint(low=1, high=100, size=data_length),
    "temperature": np.random.randint(low=1, high=100, size=data_length),
    "elevation": np.random.randint(low=1, high=100, size=data_length),
}
df = pd.DataFrame(fake_data)


def iter_with_idx():
    # Plain tuples, with values looked up by position via the dictionary
    result_data = []
    idx = {name: i for i, name in enumerate(list(df), start=1)}
    for row in df.itertuples(name=None):
        row_calc = row[idx['pine_cones']] / row[idx['area']]
        result_data.append(row_calc)
    return result_data


def iter_with_getattr():
    # Namedtuples, with values looked up by field name via getattr
    result_data = []
    for row in df.itertuples():
        row_calc = getattr(row, 'pine_cones') / getattr(row, 'area')
        result_data.append(row_calc)
    return result_data


dict_idx_method = timeit.timeit(iter_with_idx, number=100)
get_attr_method = timeit.timeit(iter_with_getattr, number=100)

print(f'Dictionary index Method {dict_idx_method:0.4f} seconds')
print(f'Get attribute method {get_attr_method:0.4f} seconds')

Result:

Dictionary index Method 49.1814 seconds
Get attribute method 80.1912 seconds

I assume the difference comes from the lower overhead of creating a plain tuple rather than a named tuple, and from the lower overhead of accessing a value by index rather than through getattr, but both of those are just guesses. If anyone knows better, please comment.

I have not explored how the number of columns relative to the number of rows affects the timing results.

answered Oct 13 '22 05:10

Mint
