What is the best way to take a data file that contains a header row and read this row into a named tuple so that the data rows can be accessed by header name?
I was attempting something like this:
import csv
from collections import namedtuple

with open('data_file.txt', mode="r") as infile:
    reader = csv.reader(infile)
    Data = namedtuple("Data", ", ".join(i for i in reader[0]))
    next(reader)
    for row in reader:
        data = Data(*row)
The reader object is not subscriptable, so the above code throws a TypeError. What is the Pythonic way to read a file header into a namedtuple?
Use:
Data = namedtuple("Data", next(reader))
and omit the line:
next(reader)
Combining this with an iterative version based on martineau's comment below, the example becomes, for Python 2:
import csv
from collections import namedtuple
from itertools import imap

with open("data_file.txt", mode="rb") as infile:
    reader = csv.reader(infile)
    Data = namedtuple("Data", next(reader))  # get names from column headers
    for data in imap(Data._make, reader):
        print data.foo
        # ...further processing of a line...
and for Python 3:
import csv
from collections import namedtuple

with open("data_file.txt", newline="") as infile:
    reader = csv.reader(infile)
    Data = namedtuple("Data", next(reader))  # get names from column headers
    for data in map(Data._make, reader):
        print(data.foo)
        # ...further processing of a line...
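One caveat when taking field names straight from the header: namedtuple rejects names that are not valid Python identifiers, such as headers containing spaces. Here is a minimal sketch of one way around that, using namedtuple's rename=True option so invalid headers fall back to positional names. The sample data and its column names are hypothetical and stand in for data_file.txt.

```python
import csv
import io
from collections import namedtuple

# Hypothetical sample data standing in for data_file.txt; note the space
# in the first header, which is not a valid identifier.
sample = io.StringIO("first name,foo\nAda,1\nBob,2\n")

reader = csv.reader(sample)
# rename=True replaces invalid field names with _0, _1, ... by position.
Data = namedtuple("Data", next(reader), rename=True)
rows = [Data._make(row) for row in reader]

print(Data._fields)
print(rows[0].foo)
```

Without rename=True, the namedtuple call above would raise a ValueError for the "first name" header. Note that every field is still a string, because csv does no type conversion.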
Please have a look at csv.DictReader. It takes the column names from the first row, as you're looking for, and then lets you access each column in a row by name through a dictionary.
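As a rough illustration of access by column name with csv.DictReader, here is a minimal sketch. The in-memory sample data and the column names foo and bar are assumptions made for the example, not taken from the asker's file.

```python
import csv
import io

# Hypothetical sample data standing in for data_file.txt.
sample = io.StringIO("foo,bar\n1,2\n3,4\n")

reader = csv.DictReader(sample)
# Each row is a mapping keyed by the header names from the first line.
foo_values = [row["foo"] for row in reader]

print(reader.fieldnames)
print(foo_values)
```

The values come back as strings, so any numeric conversion has to be done explicitly.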
If for some reason you still need to access the rows as a collections.namedtuple, it should be easy to transform the dictionaries to named tuples as follows:
import collections
import csv

with open('data_file.txt') as infile:
    reader = csv.DictReader(infile)
    # reader.fieldnames holds the column names read from the header row
    Data = collections.namedtuple('Data', reader.fieldnames)
    tuples = [Data(**row) for row in reader]