Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Assigning rank to a list of tuples in haskell

How do I assign rank to a list of tuples in Haskell (taking into account ties)? Ideally a want a function that given a list of tuples, would return a list of tuples with ranks.

sample input (assume ordered in ascending order based on snd of each tuple):

results1 = [("a",12),("b",56),("c",61),("d",73),("e",75),("f",75),("g",81),("h",82),("i",91),("j",97)]

sample output:

[("a",1),("b",2),("c",3),("d",4),("e",5.5),("f",5.5),("g",7),("h",8),("i",9),("j",10)]

Notice "e" and "f" tied, so they get their ranks (5 and 6) added together and divided by 2. More generally, any n amount of ties for a specific range of ranks [i..j] will all receive the same rank of sum [i..j] / n.

NOTE: I have just started learning Haskell today (coming from Python & Java), so I would prefer if helpful hints would be offered rather than revealing the answer. Just enough to push me to the right solution. Thanks!

EDIT/PART 2 QUESTION: Ok, so thanks to jamshidh, chunksof50, and leftaroundabout I've come up with

sortStudents xs = sortBy (compare `on` snd) xs

prerankStudents xs = groupBy ((==) `on` (snd.fst)) (zip (sortStudents xs) [1..])

rankStudents xs = concat [ [if length ys > 1 then (a, fromIntegral (sum (map snd ys)) / fromIntegral (length ys)) else (a,fromIntegral c) | ((a,b),c) <- ys] | ys <- (prerankStudents . sortStudents) xs ]

I'm relatively happy with sortStudents and prerankStudents, however rankStudents feels a bit like I'm writing python again (list comprehension), although I'm not sure if in this case it is a good or bad thing. I tried implementing rankStudents recursively with case..of however smoothening out the errors seemed over my head. Here is the code if anyone is up for explaining to me in detail why it doesn't work.

rankStudents xs = let ss = prerankStudents xs
              rankStudents' ys = case ys of [] -> []
                            [((a,b),c)] -> [(a,c)]
                            (((a1,b1),c1):zs) -> [((fst.fst) tup, fromIntegral (sum (map snd ys)) / fromIntegral (length ys)) | tup <- ys]              
                            y:ys -> rankStudents' y ++ rankStudents' ys
              in rankStudents' ss   
like image 209
Jane Doe Avatar asked Jan 04 '14 00:01

Jane Doe


3 Answers

Here are some functions that you will find useful....

Data.List.groupBy
Data.List.sortBy --you won't actually need this if you assume the input is ordered, but I threw it in anyway
Data.Function.on
(==)

You can group the data by second term, then use recursion to output the values, incrementing the rank per item.... If the number of items in a group is greater than one, just increment by this value, and output the values with the average of the values acording to the rank in a group.

This is enough to get you going without giving the full answer.

like image 144
jamshidh Avatar answered Nov 02 '22 03:11

jamshidh


Hints and tips you asked for, so here goes with the tips first:

  1. Use hoogle to find out about functions.
  2. Use type signatures and pay attention to them.
  3. You'll learn more from learning haskell by doing things the pure functional way, so avoid the IO monad except for final plumbing, otherwise you'll end up writing like you used to.
  4. Recursion is your spanner, but higher order functions are your heavy machinery.
  5. Write lots of small functions that do something simple rather than one big function that solve

Hints

  • zip, zipWith - combine lists. You'll want the infinite list [1..] at some point.
  • groupBy. It's worth looking at the function on at some point while we're in this area, but that's one where unusually the type signature isn't much help and you want to see the examples.
  • sum, length, map. You're going to have to get creative with map; map is a fundamental list processing function. Use types to figure out which you might use out offst.snd or snd.fst or fst.fst or snd.snd
  • map some more and at the end you'll need to concat

You may end up defining additional functions, or go at it a very different way in the end, but learning about these will build your Haskell skills.

like image 32
not my job Avatar answered Nov 02 '22 05:11

not my job


If indeed the list os already sorted (if not, trivial to use sortBy comparing snd) then this is more or less replacing the snd field with a simple enumeration. So first do the enumeration! That's ridiculously simple in Haskell: zip results1 [1..] produces

[(("a",12),1),(("b",56),2),(("c",61),3),(("d",73),4),(("e",75),5),(("f",75),6),(("g",81),7),(("h",82),8),(("i",91),9),(("j",97),10)]

What's left to do is mainly equalising duplicates. I'd first group those: groupBy ((==)`on`(snd.snd)) gives

[[(("a",12),1)],[(("b",56),2)],[(("c",61),3)],[(("d",73),4)],[(("e",75),5),(("f",75),6)],[(("g",81),7)],[(("h",82),8)],[(("i",91),9)],[(("j",97),10)]]

Note how all elements are now in a singleton list each, except "e" and "f" share a list.

Then, for each of these lists, you average the rank. That can be written nicely as a list comprehension with pattern matching, a good thing you can do as an excercise.

like image 2
leftaroundabout Avatar answered Nov 02 '22 04:11

leftaroundabout