I have a generated file with thousands of lines like the following:
CODE,XXX,DATE,20101201,TIME,070400,CONDITION_CODES,LTXT,PRICE,999.0000,QUANTITY,100,TSN,1510000001
Some lines have more fields and others have fewer, but all follow the same pattern of key-value pairs and each line has a TSN field.
When doing some analysis on the file, I wrote a loop like the following to read the file into a dictionary:
#!/usr/bin/env python
from sys import argv

records = {}
for line in open(argv[1]):
    fields = line.strip().split(',')
    record = dict(zip(fields[::2], fields[1::2]))
    records[record['TSN']] = record

print 'Found %d records in the file.' % len(records)
...which is fine and does exactly what I want it to (the print is just a trivial example).
However, it doesn't feel particularly "pythonic" to me, and the line
dict(zip(fields[::2], fields[1::2]))
just feels "clunky" (how many times does it iterate over the fields?).
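If I understand the slicing correctly, that one-liner builds two throwaway lists before zip() even starts (the field list here is a made-up fragment of one of my lines):
fields = ['CODE', 'XXX', 'DATE', '20101201', 'TSN', '1510000001']
keys = fields[::2]     # one pass over fields, builds ['CODE', 'DATE', 'TSN']
values = fields[1::2]  # second pass, builds ['XXX', '20101201', '1510000001']
print(dict(zip(keys, values)))  # zip() then walks both temporary lists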
Is there a better way of doing this in Python 2.6 with just the standard modules to hand?
In Python 2 you could use izip in the itertools module and the magic of generator objects to write your own function to simplify the creation of pairs of values for the dict records. I got the idea for pairwise() from a similarly named (although functionally different) recipe in the Python 2 itertools docs.
To use the approach in Python 3, you can just use plain zip(), since it does what izip() did in Python 2 (which is why the latter was removed from itertools). The example below accounts for this and should work in both versions.
try:
    from itertools import izip
except ImportError:  # Python 3
    izip = zip

def pairwise(iterable):
    "s -> (s0, s1), (s2, s3), (s4, s5), ..."
    a = iter(iterable)
    return izip(a, a)
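A quick sanity check (using a made-up fragment of one of your lines) shows the trick: both arguments to izip() draw from the same iterator, so consecutive items get paired off:
sample = ['CODE', 'XXX', 'DATE', '20101201', 'TSN', '1510000001']
print(list(pairwise(sample)))
# [('CODE', 'XXX'), ('DATE', '20101201'), ('TSN', '1510000001')]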
pairwise() can then be used like this in your file-reading for loop:
from sys import argv

records = {}
for line in open(argv[1]):
    fields = (field.strip() for field in line.split(','))  # generator expr
    record = dict(pairwise(fields))
    records[record['TSN']] = record

print('Found %d records in the file.' % len(records))
But wait, there's more!
It's possible to create a generalized version I'll call grouper(), which again corresponds to a similarly named itertools recipe (which is listed right below pairwise()):
def grouper(n, iterable):
    "s -> (s0,s1,...sn-1), (sn,sn+1,...s2n-1), (s2n,s2n+1,...s3n-1), ..."
    return izip(*[iter(iterable)]*n)
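The trick here is that *[iter(iterable)]*n passes the same iterator object n times, so izip() pulls n consecutive items into each tuple. For example (reusing the izip name defined above):
print(list(grouper(3, 'ABCDEFGHI')))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'H', 'I')]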
grouper() could then be used like this in your for loop:
record = dict(grouper(2, fields))
Of course, for specific cases like this, it's easy to use functools.partial() and create a similar pairwise() function with it (which will work in both Python 2 and 3):
import functools
pairwise = functools.partial(grouper, 2)
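As far as I can tell, this partial behaves exactly like the hand-written pairwise() above (again checked with made-up fields):
print(list(pairwise(['TSN', '1510000001', 'PRICE', '999.0000'])))
# [('TSN', '1510000001'), ('PRICE', '999.0000')]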
Postscript
Unless there's a really huge number of fields, you could instead create an actual sequence out of the pairs of line items (rather than using a generator expression, which has no len()):
fields = tuple(field.strip() for field in line.split(','))
The advantage is that it allows the grouping to be done using simple slicing:
try:
    xrange
except NameError:  # Python 3
    xrange = range

def grouper(n, sequence):
    for i in xrange(0, len(sequence), n):
        yield sequence[i:i+n]

pairwise = functools.partial(grouper, 2)
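A quick check with a hard-coded line: each group is now a slice of the tuple rather than an izip() tuple, but dict() accepts either.
line = 'CODE,XXX,TSN,1510000001'
fields = tuple(field.strip() for field in line.split(','))
print(dict(pairwise(fields)))
# {'CODE': 'XXX', 'TSN': '1510000001'}  (key order may vary in older Pythons)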