I can think of sorting them and then going over each element one by one, but this is O(n log n). Is there a linear-time method to count the distinct elements in a list?
Update: distinct vs. unique
If you are looking for "unique" values (as in, if you see an element "JASON" more than once, then it is no longer unique and should not be counted), you can do that in linear time by using a HashMap ;)
(The generalized / language-agnostic idea is a hash table.)
Each entry of a HashMap / hash table is a <KEY, VALUE> pair where the keys are unique (but there are no restrictions on their corresponding values).
Step 1:
Iterate through all elements in the list once, using each element as a KEY and the number of times it has been seen so far as its VALUE: O(n)
Step 2:
Iterate through the HashMap and count the KEYS with VALUE equal to exactly 1 (thus unique): O(n)
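The two steps above could be sketched in Java roughly as follows. This is only a minimal illustration, not the answerer's own code: the class name `UniqueCounter` and the method name `countUnique` are made up for the example, and it assumes the elements have sensible `equals` / `hashCode` implementations.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, named only for this example.
final class UniqueCounter {

    // Returns how many elements occur exactly once in the list.
    static <T> int countUnique(List<T> items) {
        // Step 1: one pass over the list, KEY = element, VALUE = occurrences seen.
        Map<T, Integer> occurrences = new HashMap<>();
        for (T item : items) {
            occurrences.merge(item, 1, Integer::sum);
        }

        // Step 2: one pass over the map, counting KEYS whose VALUE is exactly 1.
        int unique = 0;
        for (int count : occurrences.values()) {
            if (count == 1) {
                unique++;
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        // "JASON" appears twice, so only "ANNA" and "BOB" are unique.
        System.out.println(countUnique(List.of("JASON", "ANNA", "JASON", "BOB"))); // prints 2
    }
}
```

Each `merge` call is expected constant time, so both passes stay linear in the size of the list.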
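Analysis:
Both steps take O(n), so the total running time is O(n) + O(n) = O(n) (expected time, assuming constant-time hash table operations), at the cost of O(n) extra space for the table.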
If, however, you are looking for "distinct" values (as in, you want to count how many different elements there are), use a HashSet instead of a HashMap / hash table, and then simply query the size of the HashSet.
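A minimal Java sketch of the HashSet approach, under the same assumptions as before (the names `DistinctCounter` and `countDistinct` are hypothetical, and elements are assumed to implement `equals` / `hashCode` properly):

```java
import java.util.HashSet;
import java.util.List;

// Hypothetical helper class, named only for this example.
final class DistinctCounter {

    // Returns how many different values the list contains.
    static <T> int countDistinct(List<T> items) {
        // Copying into a HashSet drops duplicates; its size is the distinct count.
        return new HashSet<>(items).size();
    }

    public static void main(String[] args) {
        // Distinct values are "JASON", "ANNA" and "BOB".
        System.out.println(countDistinct(List.of("JASON", "ANNA", "JASON", "BOB"))); // prints 3
    }
}
```

Building the set is a single pass with expected constant-time insertions, so this is O(n) expected time and O(n) extra space.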
You can adapt this extremely cool O(n)-time and O(1)-space in-place algorithm for removing duplicates to the task of counting distinct values -- simply count the number of values equal to the sentinel value in a final O(n) pass, and subtract that from the size of the list.