I am trying to write a program that counts the frequency of each element in a list.
In: "aabbcabb"
Out: [("a",3),("b",4),("c",1)]
You can view my code at the following link: http://codepad.org/nyIECIT2. In this code, the output of the unique function would look like this:
In: "aabbcabb"
Out: "abc"
Using the output of unique, we will count the frequency of each element in the target list. The code is also shown here:
frequencyOfElt xs=ans
where ans=countElt(unique xs) xs
unique []=[]
unique xs=(head xs):(unique (filter((/=)(head xs))xs))
countElt ref target=ans'
where ans'=zip ref lengths
lengths=map length $ zipWith($)(map[(=='a'),(==',b'),(==',c')](filter.(==))ref)(repeat target)
Error: Syntax error in input (unexpected symbol "unique")
But GHCi 6.13 also reports other kinds of errors.
A few people asked what the purpose of [(=='a'),(==',b'),(==',c')] is. What I expect: if ref="abc" and target="aabbaacc", then
zipWith($) (map filter ref)(repeat target)
will give ["aaaa","bb","cc"], and then I can map length over this to get the frequencies. To filter the list according to ref, I use [(=='a'),(==',b'),(==',c')].
I assume some logical error lies in [(=='a'),(==',b'),(==',c')].
Using multiset-0.1:
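import Data.MultiSet (fromList, toOccurList)
freq :: Ord a => [a] -> [(a, Int)]  -- signature avoids the monomorphism restriction
freq = toOccurList . fromList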
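If pulling in the multiset package is not an option, roughly the same result can be sketched with Data.Map from the containers package. This is my own variation, not part of the answer above, and the name freqMap is made up for illustration. It assumes the elements have an Ord instance.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical helper: pair every element with 1, then let
-- fromListWith add up the ones that share a key.
freqMap :: Ord a => [a] -> [(a, Int)]
freqMap = Map.toList . Map.fromListWith (+) . map (\x -> (x, 1))
```

Map.toList returns the pairs in ascending key order, so for "aabbcabb" the keys come out as 'a', 'b', 'c'.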
You didn't say whether you want to write it whole on your own, or whether it's OK to compose it from some standard functions.
import Data.List
g s = map (\x -> ([head x], length x)) . group . sort $ s
-- g = map (head &&& length) . group . sort -- without the [...]; (&&&) comes from Control.Arrow
is the standard quick-n-dirty way to code it.
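For reference, here is a self-contained sketch of both variants with explicit type signatures. The name g' for the second one is my own; the signatures are an assumption about what you want (String keys to match your expected output, plain elements otherwise).

```haskell
import Control.Arrow ((&&&))
import Data.List (group, sort)

-- Keys as one-element strings, matching the output format in the question.
g :: String -> [(String, Int)]
g = map (\x -> ([head x], length x)) . group . sort

-- Keys as plain elements; works for any type with an Ord instance.
-- head is safe here because group never produces an empty sublist.
g' :: Ord a => [a] -> [(a, Int)]
g' = map (head &&& length) . group . sort
```

With these definitions, g "aabbcabb" gives [("a",3),("b",4),("c",1)].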
OK, so your original idea was to Code it Point-Free Style (certain tune playing in my head...):
frequencyOfElt :: (Eq a) => [a] -> [(a,Int)]
frequencyOfElt xs = countElt (unique xs) xs   -- change the result type
  where
    unique [] = []
    unique (x:xs) = x : unique (filter (/= x) xs)
    countElt ref target =   -- Code it Point-Free Style (your original idea)
      zip
        ref $               -- your original type would need (map (:[]) ref) here
        map length $
          zipWith ($)       -- ((filter . (==)) c) === (filter (== c))
            (zipWith ($) (repeat (filter . (==))) ref)
            (repeat target)
I've changed the type here to the more reasonable [a] -> [(a,Int)], by the way. Note that
zipWith ($) fs (repeat z) === map ($ z) fs
zipWith ($) (repeat f) zs === map (f $) zs === map f zs
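These two identities are easy to spot-check on small inputs. A minimal sketch, with made-up names (lhs1, rhs1 and so on) and arbitrary sample functions:

```haskell
-- Applying each function in a list to one shared argument, two ways.
lhs1, rhs1 :: [Int]
lhs1 = zipWith ($) [(+ 1), (* 2), subtract 3] (repeat 10)
rhs1 = map ($ 10) [(+ 1), (* 2), subtract 3]

-- Applying one function to each argument in a list, three ways.
lhs2, mid2, rhs2 :: [Int]
lhs2 = zipWith ($) (repeat negate) [1, 2, 3]
mid2 = map (negate $) [1, 2, 3]
rhs2 = map negate [1, 2, 3]
```

Each group evaluates to the same list, which is all the simplification relies on. The zipWith versions terminate because zipWith stops at the shorter (finite) list.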
hence the code simplifies to
countElt ref target =
  zip
    ref $
    map length $
      map ($ target)
        (zipWith ($) (repeat (filter . (==))) ref)
and then
countElt ref target =
  zip
    ref $
    map length $
      map ($ target) $
        map (filter . (==)) ref
but map f $ map g xs === map (f.g) xs, so
countElt ref target =
  zip
    ref $
    map (length . ($ target) . filter . (==)) ref   -- (1)
which is a bit clearer (for my taste) written with a list comprehension,
countElt ref target =
     [ (c, (length . ($ target) . filter . (==)) c) | c <- ref]
  == [ (c, length ( ($ target) ( filter (== c)))) | c <- ref]
  == [ (c, length $ filter (== c) target) | c <- ref]
Which gives us an idea to re-write (1) further as
countElt ref target =
  zip <*> map (length . (`filter` target) . (==)) $ ref
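In case the `<*>` looks mysterious: for functions, (f <*> g) x = f x (g x), so zip <*> h is the same as \ref -> zip ref (h ref). Here is a small sketch putting the point-free form next to the list comprehension so they can be compared. The names countEltPF and countEltLC are mine, and I am assuming a GHC recent enough (7.10 or later) to export `<*>` from the Prelude; on older versions, import Control.Applicative.

```haskell
-- Point-free-ish form: zip ref (map countOne ref), via the function Applicative.
countEltPF :: Eq a => [a] -> [a] -> [(a, Int)]
countEltPF ref target =
  zip <*> map (length . (`filter` target) . (==)) $ ref

-- The same thing spelled out as a list comprehension.
countEltLC :: Eq a => [a] -> [a] -> [(a, Int)]
countEltLC ref target = [ (c, length $ filter (== c) target) | c <- ref ]
```

With ref = "abc" and target = "aabbaacc" from the question, both give [('a',4),('b',2),('c',2)].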
but this obsession with point-free code becomes pointless here.
So going back to the readable list comprehensions, using a standard nub function which is equivalent to your unique, your idea becomes
import Data.List
frequencyOfElt xs = [ (c, length $ filter (== c) xs) | c <- nub xs]
This algorithm is actually quadratic (~ n^2), so it is worse than the first version above which is dominated by sort, i.e. is linearithmic (~ n log(n)).
This code though can be manipulated further by a principle of equivalent transformations:
= [ (c, length . filter (== c) $ sort xs) | c <- nub xs]
... because searching in a list is the same as searching in a list, sorted. Doing more work here -- will it pay off?..
= [ (c, length . filter (== c) $ sort xs) | (c:_) <- group $ sort xs]
... right? But now, group had already grouped them by (==), so there's no need for the filter call to repeat the work already done by group:
= [ (c, length . get c . group $ sort xs) | (c:_) <- group $ sort xs]
where get c gs = fromJust . find ((== c).head) $ gs
= [ (c, length g) | g@(c:_) <- group $ sort xs]
= [ (head g, length g) | g <- group (sort xs)]
= (map (head &&& length) . group . sort) xs
isn't it? And here it is, the same linearithmic algorithm from the beginning of this post, actually derived from your code by factoring out its hidden common computations, making them available for reuse and code simplification.
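One caveat worth checking for yourself: swapping nub xs for group (sort xs) changes the order of the result (first occurrence vs. sorted) and needs Ord instead of just Eq. A minimal sketch comparing the two ends of the derivation, with names freqNub and freqSort that I made up for the comparison:

```haskell
import Control.Arrow ((&&&))
import Data.List (group, nub, sort)

-- Starting point of the derivation: quadratic, keys in first-occurrence order.
freqNub :: Eq a => [a] -> [(a, Int)]
freqNub xs = [ (c, length $ filter (== c) xs) | c <- nub xs ]

-- End point of the derivation: dominated by sort, keys in sorted order.
freqSort :: Ord a => [a] -> [(a, Int)]
freqSort = map (head &&& length) . group . sort
```

For "cab", freqNub gives [('c',1),('a',1),('b',1)] while freqSort gives [('a',1),('b',1),('c',1)]; the counts agree once the nub-based result is sorted.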