Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Creating a dictionary from a CSV file

I am trying to take input from a CSV file and then push it into a dictionary format (I am using Python 3.x).

I use the code below to read in the CSV file and that works:

import csv

reader = csv.reader(open('C:\\Users\\Chris\\Desktop\\test.csv'), delimiter=',', quotechar='|')

for row in reader:
    print(', '.join(row))

But now I want to place the results into a dictionary. I would like the first row of the CSV file to be used as the "key" field for the dictionary with the subsequent rows in the CSV file filling out the data portion.

Sample data:

     Date        First Name     Last Name     Score
12/28/2012 15:15        John          Smith        20
12/29/2012 15:15        Alex          Jones        38
12/30/2012 15:15      Michael       Carpenter      25

How can I get the dictionary to work?

like image 802
gakar06 Avatar asked Dec 30 '12 14:12

gakar06


People also ask

How do I convert a CSV file to a dictionary?

The best way to convert a CSV file to a Python dictionary is to create a CSV file object f using open("my_file. csv") and pass it in the csv. DictReader(f) method. The return value is an iterable of dictionaries, one per row in the CSV file, that maps the column header from the first row to the specific row value.

Can you turn a list into a dictionary?

To convert a list to a dictionary using the same values, you can use the dict. fromkeys() method. To convert two lists into one dictionary, you can use the Python zip() function. The dictionary comprehension lets you create a new dictionary based on the values of a list.

What is a csv dictionary?

CSV (Comma Separated Values) is a simple file format used to store tabular data, such as a spreadsheet or database. CSV file stores tabular data (numbers and text) in plain text. Each line of the file is a data record.


3 Answers

Create a dictionary, then iterate over the result and stuff the rows in the dictionary. Note that if you encounter a row with a duplicate date, you will have to decide what to do (raise an exception, replace the previous row, discard the later row, etc...)

Here's test.csv:

Date,Foo,Bar
123,456,789
abc,def,ghi

and the corresponding program:

import csv
reader = csv.reader(open('test.csv'))

result = {}
for row in reader:
    key = row[0]
    if key in result:
        # implement your duplicate row handling here
        pass
    result[key] = row[1:]
print(result)

yields:

{'Date': ['Foo', 'Bar'], '123': ['456', '789'], 'abc': ['def', 'ghi']}

or, with DictReader:

import csv
reader = csv.DictReader(open('test.csv'))

result = {}
for row in reader:
    key = row.pop('Date')
    if key in result:
        # implement your duplicate row handling here
        pass
    result[key] = row
print(result)

results in:

{'123': {'Foo': '456', 'Bar': '789'}, 'abc': {'Foo': 'def', 'Bar': 'ghi'}}

Or perhaps you want to map the column headings to a list of values for that column:

import csv
reader = csv.DictReader(open('test.csv'))

result = {}
for row in reader:
    for column, value in row.items():  # consider .iteritems() for Python 2
        result.setdefault(column, []).append(value)
print(result)

That yields:

{'Date': ['123', 'abc'], 'Foo': ['456', 'def'], 'Bar': ['789', 'ghi']}
like image 110
Phil Frost Avatar answered Sep 23 '22 02:09

Phil Frost


You need a Python DictReader class. More help can be found from here

import csv

with open('file_name.csv', 'rt') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print row
like image 12
Aamir Rind Avatar answered Sep 21 '22 02:09

Aamir Rind


Help from @phil-frost was very helpful, was exactly what I was looking for.

I have made few tweaks after that so I'm would like to share it here:

def csv_as_dict(file, ref_header, delimiter=None):

    import csv
    if not delimiter:
        delimiter = ';'
    reader = csv.DictReader(open(file), delimiter=delimiter)
    result = {}
    for row in reader:
        print(row)
        key = row.pop(ref_header)
        if key in result:
            # implement your duplicate row handling here
            pass
        result[key] = row
    return result

You can call it:

myvar = csv_as_dict(csv_file, 'ref_column')

Where ref_colum will be your main key for each row.

like image 2
Pablo Daniel Estigarribia Davy Avatar answered Sep 21 '22 02:09

Pablo Daniel Estigarribia Davy