Count by pairs/Pivot table

Question

I've got the following data in a CSV format:

Date    Name    Color
12/11   Thomas  Blue
12/31   Andy    Black
12/21   Luise   Red
12/41   Mark    Blue
12/11   Ronda   Black
12/11   Thomas  Blue
12/21   Mark    Green
12/11   Ronda   Black
12/31   Luise   Red
12/41   Luise   Green

And I would like to create a count based on pairs, sort of pivot table out of it as below. Ideally as a CSV file as well

        Blue    Black   Red Green
Thomas   2          
Andy             1      
Luise                    2    1
Mark     1                    1
Ronda            1            1

I'm not entirely sure how to get around the problem. Can't use pandas either. :(

tobias_k · Accepted Answer

You could use a defaultdict of defaultdict of int to store the color counts.

import csv, collections

counts = collections.defaultdict(lambda: collections.defaultdict(int))
colors = set()
with open("data.csv") as f:
    reader = csv.reader(f, delimiter="	")
    next(reader) # skip first line
    for date, name, color in reader:
        counts[name][color] += 1
        colors.add(color)

Then, print the counts for the different colors (or write to CSV):

colors = list(colors)
print(colors)
for name in counts:
    print(name + "	" + "	".join(str(counts[name][color]) for color in colors))

Result (I'll leave the fine-tuning to you):

['Red', 'Blue', 'Green', 'Black']
Ronda   0   0   0   2
Thomas  0   2   0   0
Andy    0   0   0   1
Luise   2   0   1   0
Mark    0   1   1   0

Count by pairs/Pivot table

Tags:

python

python-3.x

csv

That guy

1 Answers

tobias_k

Recent Activity

Donate For Us

Count by pairs/Pivot table

Tags:

python

python-3.x

csv

That guy

1 Answers

tobias_k

Related questions

Recent Activity

Donate For Us