I have 2 CSV files: 'Data' and 'Mapping':
The Mapping file has four columns: Device_Name, GDN, Device_Type, and Device_OS. All four columns are populated.
The Data file has the same columns, with the Device_Name column populated and the other three columns blank.
For each Device_Name in the Data file, I want to map its GDN, Device_Type, and Device_OS value from the Mapping file.
I know how to use a dict when only 2 columns are present (1 needs to be mapped), but I don't know how to accomplish this when 3 columns need to be mapped.
Here is the code I used to try to map Device_Type:
    x = dict([])
    with open("Pricing Mapping_2013-04-22.csv", "rb") as in_file1:
        file_map = csv.reader(in_file1, delimiter=',')
        for row in file_map:
            typemap = [row[0],row[2]]
            x.append(typemap)

    with open("Pricing_Updated_Cleaned.csv", "rb") as in_file2, open("Data Scraper_GDN.csv", "wb") as out_file:
        writer = csv.writer(out_file, delimiter=',')
        for row in csv.reader(in_file2, delimiter=','):
            try:
                row[27] = x[row[11]]
            except KeyError:
                row[27] = ""
            writer.writerow(row)
It raises an AttributeError.
After some researching, I think I need to create a nested dict, but I don't have any idea how to do this.
To create a nested dictionary, pass key=value pairs as keyword arguments to the dict() constructor, where each value is itself a dictionary. You can also combine dict() with zip() to build a dictionary from separate lists of keys and values obtained dynamically at runtime.
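As a small illustration of both constructions, the sketch below builds the same nested dictionary twice. The device names and field values are made-up sample data, and keyword arguments only work when the outer keys are valid Python identifiers.

```python
# Nested dictionary built with keyword arguments: each value is itself a dict.
# The device names and field values below are made-up sample data.
devices = dict(
    laptop=dict(GDN="G1", Device_Type="Computer", Device_OS="Windows"),
    phone=dict(GDN="G2", Device_Type="Mobile", Device_OS="Android"),
)

# Nested dictionary built with zip(): pair each outer key with an inner dict
# that is itself zipped from a list of field names and a list of values.
fields = ["GDN", "Device_Type", "Device_OS"]
names = ["laptop", "phone"]
rows = [["G1", "Computer", "Windows"], ["G2", "Mobile", "Android"]]
zipped = dict(zip(names, (dict(zip(fields, row)) for row in rows)))
```

The zip() form is usually the more practical one when the keys come from data, such as a CSV header row.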
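Key points to remember: a nested dictionary is a dictionary whose values are themselves dictionaries. Before Python 3.7, dictionaries did not guarantee any ordering. Dictionaries cannot be sliced; inner values are reached by chaining key lookups.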
A nested dict is a dictionary within a dictionary. A very simple thing.
    >>> d = {}
    >>> d['dict1'] = {}
    >>> d['dict1']['innerkey'] = 'value'
    >>> d['dict1']['innerkey2'] = 'value2'
    >>> d
    {'dict1': {'innerkey': 'value', 'innerkey2': 'value2'}}
You can also use a defaultdict from the collections module to facilitate creating nested dictionaries.
    >>> import collections
    >>> d = collections.defaultdict(dict)
    >>> d['dict1']['innerkey'] = 'value'
    >>> d  # currently a defaultdict type
    defaultdict(<type 'dict'>, {'dict1': {'innerkey': 'value'}})
    >>> dict(d)  # but is exactly like a normal dictionary.
    {'dict1': {'innerkey': 'value'}}
You can populate that however you want.
I would recommend in your code something like the following:
    d = {}  # can use defaultdict(dict) instead

    for row in file_map:
        # derive row key from something; here, as an assumption, the first column
        row_key = row[0]
        # when using defaultdict, we can skip the next step creating a dictionary on row_key
        d[row_key] = {}
        for idx, col in enumerate(row):
            d[row_key][idx] = col
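Applied to the question, the same idea could look like the sketch below. It assumes Device_Name is the first column of every mapping row, and it uses small in-memory lists with made-up values in place of the real CSV files.

```python
# Rows as they might come out of csv.reader for the Mapping file
# (hypothetical sample values; the first row is the header).
mapping_rows = [
    ["Device_Name", "GDN", "Device_Type", "Device_OS"],
    ["iPhone", "G1", "Mobile", "iOS"],
    ["ThinkPad", "G2", "Computer", "Windows"],
]

header = mapping_rows[0]
mapping = {}
for row in mapping_rows[1:]:
    # Outer key: Device_Name. Inner dict: remaining column name -> value.
    mapping[row[0]] = dict(zip(header[1:], row[1:]))

# Rows from the Data file: only Device_Name is populated.
data_rows = [
    ["iPhone", "", "", ""],
    ["Unknown", "", "", ""],
]

filled = []
for row in data_rows:
    # Fall back to an empty dict so unmapped devices keep blank columns.
    info = mapping.get(row[0], {})
    filled.append([row[0]] + [info.get(col, "") for col in header[1:]])
```

Keying the inner dictionaries by column name rather than by index keeps the lookup readable and independent of column order.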
According to your comment:
Maybe the above code is confusing the question. My problem in a nutshell: I have two files, a.csv and b.csv. a.csv has 4 columns i, j, k, l; b.csv also has these columns. i is the key column for both files. Columns j, k, l are empty in a.csv but populated in b.csv. I want to map the values of the j, k, l columns from b.csv to a.csv, using i as the key column.
My suggestion would be something like this (without using defaultdict):
    a_file = "path/to/a.csv"
    b_file = "path/to/b.csv"

    # read from file a.csv
    with open(a_file) as f:
        # skip headers
        next(f)
        # get first column as keys; a set (not a lazy generator) so the keys
        # survive after the file is closed and support repeated lookups
        keys = set(line.split(',')[0].strip() for line in f)

    # create empty dictionary:
    d = {}

    # read from file b.csv
    with open(b_file) as f:
        # gather headers except first key header
        headers = next(f).strip().split(',')[1:]
        # iterate lines
        for line in f:
            # gather the columns
            cols = line.strip().split(',')
            # check to make sure this key should be mapped.
            if cols[0] not in keys:
                continue
            # add key to dict
            d[cols[0]] = dict(
                # inner keys are the header names, values are columns
                (headers[idx], v) for idx, v in enumerate(cols[1:]))
Please note though, that for parsing csv files there is a csv module.
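A minimal sketch of the same mapping using the csv module's DictReader and DictWriter is shown below. The io.StringIO objects are in-memory stand-ins for the real files, and their contents are made-up sample data; the column names are taken from the question.

```python
import csv
import io

# In-memory stand-ins for the two files (made-up sample contents).
mapping_csv = io.StringIO(
    "Device_Name,GDN,Device_Type,Device_OS\n"
    "iPhone,G1,Mobile,iOS\n"
    "ThinkPad,G2,Computer,Windows\n"
)
data_csv = io.StringIO(
    "Device_Name,GDN,Device_Type,Device_OS\n"
    "ThinkPad,,,\n"
    "Unknown,,,\n"
)
out_csv = io.StringIO()

# Each DictReader row is a dict keyed by the header names, so the
# nested mapping is simply Device_Name -> that row's dict.
mapping = {row["Device_Name"]: row for row in csv.DictReader(mapping_csv)}

reader = csv.DictReader(data_csv)
writer = csv.DictWriter(out_csv, fieldnames=reader.fieldnames, lineterminator="\n")
writer.writeheader()
for row in reader:
    # Overwrite the blank columns only when the device has a mapping entry.
    row.update(mapping.get(row["Device_Name"], {}))
    writer.writerow(row)

result = out_csv.getvalue()
```

With real files, the StringIO objects would be replaced by open(...) calls; in Python 3 the csv documentation recommends opening them with newline="".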
UPDATE: For an arbitrary length of a nested dictionary, go to this answer.
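One common way to get nesting of arbitrary depth, which may or may not be what the linked answer uses, is a defaultdict whose factory returns another defaultdict of the same kind. A minimal sketch, with a hypothetical helper named tree:

```python
from collections import defaultdict


def tree():
    # Hypothetical helper: every missing key creates another tree,
    # so any depth of nesting can be assigned without setup.
    return defaultdict(tree)


d = tree()
d["a"]["b"]["c"] = 1
```

Note that merely reading a missing key on such a structure also creates it, which can hide typos in key names.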
Use defaultdict from the collections module.
Performance: it avoids an explicit "if key not in dict" check before every insertion. Each check is a constant-time hash lookup on average, but the extra work adds up when the data set is large.
Low maintenance: the code is more readable and can be easily extended.
    from collections import defaultdict

    target_dict = defaultdict(dict)
    target_dict[key1][key2] = val
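A runnable sketch of this pattern follows, using made-up (Device_Name, column, value) triples in place of key1, key2, and val:

```python
from collections import defaultdict

# Hypothetical (Device_Name, column, value) triples.
triples = [
    ("iPhone", "GDN", "G1"),
    ("iPhone", "Device_OS", "iOS"),
    ("ThinkPad", "GDN", "G2"),
]

target_dict = defaultdict(dict)
for key1, key2, val in triples:
    # No "if key1 not in target_dict" check is needed:
    # the inner dict is created on first access.
    target_dict[key1][key2] = val

# Caveat: indexing a missing key would create an empty entry,
# so use .get() (or convert to a plain dict) for lookups.
missing = target_dict.get("Unknown", {})
plain = dict(target_dict)
```

Converting to a plain dict at the end restores the usual KeyError on missing keys, which is often what the lookup code expects.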