I just imported the values from a .csv file into a list of lists, and now I need to know how many distinct users there are. The imported data looks like the following:
[['123', 'apple'], ['123', 'banana'], ['345', 'apple'], ['567', 'berry'], ['567', 'banana']]
Basically, I need to know how many distinct users there are (the first value in each sub-list is a user ID; 3 in this case, over 6,000 after doing some Excel filtering), and what the frequencies are for the food itself: {'apple': 2, 'banana': 2, 'berry': 1}.
Here is the code I have tried to use for distinct values counts (using Python 2.7):
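import csv
with open('food.csv', 'rb') as food:
    next(food)  # skip the header row
    csv_food = csv.reader(food)
    result_list = list(csv_food)  # one [user_id, food] list per row
# Flattens every value from every row, so this counts all values, not distinct users
result_distinct = list(x for l in result_list for x in l)
print len(result_distinct)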
You can use [x[0] for x in result_list] to get a list of all the IDs. Then you create a set from it, which keeps only the unique items in that list. The length of the set will then give you the number of unique users.
len(set([x[0] for x in result_list]))
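The question also asks for the food frequencies. Here is a minimal sketch of both counts, assuming result_list holds rows shaped like the sample in the question (user ID first, food second). The variable names distinct_users, user_count and food_counts are made up for illustration, and collections.Counter is one standard-library option for the tally, not the only way to do it.

```python
from collections import Counter

# Sample rows copied from the question; in practice this would be the
# result_list built from the CSV file.
result_list = [['123', 'apple'], ['123', 'banana'], ['345', 'apple'],
               ['567', 'berry'], ['567', 'banana']]

# A set keeps each user ID (first column) only once.
distinct_users = {row[0] for row in result_list}
user_count = len(distinct_users)

# Counter tallies how often each food (second column) appears.
food_counts = Counter(row[1] for row in result_list)

print(user_count)         # 3
print(dict(food_counts))  # apple: 2, banana: 2, berry: 1 (key order may vary)
```

This runs on Python 2.7 as well as Python 3, since print is called with a single argument each time.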