
Count Distinct Values in a List of Lists - Python

I just imported the values from a .csv file into a list of lists, and now I need to know how many distinct users there are. The resulting list looks like the following:

[['123', 'apple'], ['123', 'banana'], ['345', 'apple'], ['567', 'berry'], ['567', 'banana']]

Basically, I need to know how many distinct users there are (the first value in each sub-list is a user ID; there are 3 in this case, and over 6,000 in the full data after some Excel filtering), and what the frequencies of the foods are: {'apple': 2, 'banana': 2, 'berry': 1}.

Here is the code I have tried to use for distinct values counts (using Python 2.7):

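import csv

with open('food.csv', 'rb') as food:
    next(food)  # skip the header row
    csv_food = csv.reader(food)
    result_list = list(csv_food)  # one [user_id, food] list per row

# Flattens every row into a single list, so this counts all values, not distinct ones
result_distinct = list(x for l in result_list for x in l)

print len(result_distinct)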
Maiia S. asked Oct 29 '22 11:10


1 Answer

You can use [x[0] for x in result_list] to get a list of all the IDs. Then create a set from that list; a set keeps only the unique items. The length of the set then gives you the number of unique users.

len(set([x[0] for x in result_list]))
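
The question also asks for the food frequencies, which the one-liner above does not cover. A minimal sketch, assuming result_list holds the sample rows from the question and using collections.Counter from the standard library (the names distinct_users and food_counts are only illustrative):

```python
from collections import Counter

# Sample rows copied from the question: [user_id, food]
result_list = [['123', 'apple'], ['123', 'banana'], ['345', 'apple'],
               ['567', 'berry'], ['567', 'banana']]

# A set keeps one copy of each user ID, so its length is the distinct count
distinct_users = len(set(row[0] for row in result_list))

# Counter tallies how often each food (second column) appears
food_counts = Counter(row[1] for row in result_list)

print(distinct_users)             # 3
print(food_counts['apple'])       # 2
print(food_counts['berry'])       # 1
```

The single-argument print(...) calls behave the same in Python 2.7 and Python 3. With the real file, result_list would come from csv.reader instead of being typed in by hand.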
blckbird answered Nov 09 '22 06:11
