
How would you express this in Haskell?

Would you use if/else to write this algorithm in Haskell? Is there a way to express it without them? It's hard to extract meaningful functions from the middle of the tree, because it is just the output of a machine learning system.

I'm implementing the algorithm for classifying segments of HTML content as Content or Boilerplate described here. The weights are already hard-coded.

curr_linkDensity <= 0.333333
| prev_linkDensity <= 0.555556
| | curr_numWords <= 16
| | | next_numWords <= 15
| | | | prev_numWords <= 4: BOILERPLATE
| | | | prev_numWords > 4: CONTENT
| | | next_numWords > 15: CONTENT
| | curr_numWords > 16: CONTENT
| prev_linkDensity > 0.555556
| | curr_numWords <= 40
| | | next_numWords <= 17: BOILERPLATE
| | | next_numWords > 17: CONTENT
| | curr_numWords > 40: CONTENT
curr_linkDensity > 0.333333: BOILERPLATE
Sean Clark Hess asked Jul 14 '15 18:07





1 Answer

Without simplifying the logic by hand (assuming you might generate this code automatically), I think MultiWayIf is pretty clean and direct.

{-# LANGUAGE MultiWayIf #-}

data Stats = Stats {
    curr_linkDensity :: Double,
    prev_linkDensity :: Double,
    ...
}

data Classification = Content | Boilerplate

classify :: Stats -> Classification
classify s = if
    | curr_linkDensity s <= 0.333333 -> if
      | prev_linkDensity s <= 0.555556 -> if
        | curr_numWords s <= 16 -> if
          | next_numWords s <= 15 -> if
            | prev_numWords s <= 4 -> Boilerplate
            | prev_numWords s > 4 -> Content
          | next_numWords s > 15 -> Content
      ...

and so on.
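
For reference, here is a minimal sketch of the whole tree from the question written this way. The `Stats` record is an assumption: it contains only the five fields the tree mentions, all typed as `Double`, and the real record may hold more. Each trailing `>` branch is written as `otherwise`, which makes every nested `if` total.

```haskell
{-# LANGUAGE MultiWayIf #-}

-- Assumed record: only the five fields the tree uses, all Double.
data Stats = Stats
  { curr_linkDensity :: Double
  , prev_linkDensity :: Double
  , curr_numWords    :: Double
  , prev_numWords    :: Double
  , next_numWords    :: Double
  }

data Classification = Content | Boilerplate
  deriving (Eq, Show)

-- Direct transcription of the decision tree; each nested `if`
-- mirrors one level of the original tree.
classify :: Stats -> Classification
classify s = if
  | curr_linkDensity s <= 0.333333 -> if
      | prev_linkDensity s <= 0.555556 -> if
          | curr_numWords s <= 16 -> if
              | next_numWords s <= 15 -> if
                  | prev_numWords s <= 4 -> Boilerplate
                  | otherwise            -> Content
              | otherwise -> Content
          | otherwise -> Content
      | otherwise -> if
          | curr_numWords s <= 40 -> if
              | next_numWords s <= 17 -> Boilerplate
              | otherwise             -> Content
          | otherwise -> Content
  | otherwise -> Boilerplate
```

With this field order, `classify (Stats 0.5 0 0 0 0)` takes the last branch, because the current link density is above 0.333333.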

However, since this is so structured (just a tree of if/else with comparisons), also consider creating a decision tree data structure and writing an interpreter for it. That lets you transform, manipulate, and inspect the tree as data. Maybe it will buy you something; defining miniature languages for your specifications can be surprisingly beneficial.

data DecisionTree i o 
    = Comparison (i -> Double) Double (DecisionTree i o) (DecisionTree i o)
    | Leaf o

runDecisionTree :: DecisionTree i o -> i -> o
runDecisionTree (Comparison f v ifLess ifGreater) i
    | f i <= v  = runDecisionTree ifLess i
    | otherwise = runDecisionTree ifGreater i
runDecisionTree (Leaf o) _ = o

-- DecisionTree is an encoding of a function, and you can write
-- Functor, Applicative, and Monad instances!
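
As a sketch of what those instances might look like (one possible definition, not necessarily the one the answer had in mind): `fmap` rewrites the leaves, and `>>=` replaces each leaf with a further subtree. The names `example` and `exampleLengths` are hypothetical and exist only to illustrate usage.

```haskell
data DecisionTree i o
  = Comparison (i -> Double) Double (DecisionTree i o) (DecisionTree i o)
  | Leaf o

runDecisionTree :: DecisionTree i o -> i -> o
runDecisionTree (Comparison f v ifLess ifGreater) i
  | f i <= v  = runDecisionTree ifLess i
  | otherwise = runDecisionTree ifGreater i
runDecisionTree (Leaf o) _ = o

-- Map over the outputs stored at the leaves; comparisons are untouched.
instance Functor (DecisionTree i) where
  fmap g (Leaf o)             = Leaf (g o)
  fmap g (Comparison f v l r) = Comparison f v (fmap g l) (fmap g r)

instance Applicative (DecisionTree i) where
  pure = Leaf
  tf <*> tx = tf >>= \g -> fmap g tx

-- Bind grafts a new subtree in place of every leaf.
instance Monad (DecisionTree i) where
  Leaf o             >>= k = k o
  Comparison f v l r >>= k = Comparison f v (l >>= k) (r >>= k)

-- Hypothetical example tree over a plain Double input.
example :: DecisionTree Double String
example = Comparison id 0.5 (Leaf "low") (Leaf "high")

-- Same decisions, but each leaf now holds the label's length.
exampleLengths :: DecisionTree Double Int
exampleLengths = fmap length example
```

Running `runDecisionTree example 0.2` follows the `<=` branch, and `exampleLengths` makes the same comparisons while returning an `Int` instead of a `String`.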

Then

 classifier :: DecisionTree Stats Classification
 classifier =
     Comparison curr_linkDensity 0.333333
       (Comparison prev_linkDensity 0.555556
         (Comparison curr_numWords 16
           (Comparison next_numWords 15
             (Comparison prev_numWords 4
               (Leaf Boilerplate)
               (Leaf Content))
             (Leaf Content)
           ...
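
Filled in from the tree in the question, the complete value might look like the sketch below. As before, the `Stats` record with five `Double` fields is an assumption. The helpers `depth` and `leaves` are hypothetical names, included only to show the kind of inspection that having the tree as data makes possible.

```haskell
-- Assumed record: only the five fields the tree uses, all Double.
data Stats = Stats
  { curr_linkDensity :: Double
  , prev_linkDensity :: Double
  , curr_numWords    :: Double
  , prev_numWords    :: Double
  , next_numWords    :: Double
  }

data Classification = Content | Boilerplate
  deriving (Eq, Show)

data DecisionTree i o
  = Comparison (i -> Double) Double (DecisionTree i o) (DecisionTree i o)
  | Leaf o

runDecisionTree :: DecisionTree i o -> i -> o
runDecisionTree (Comparison f v ifLess ifGreater) i
  | f i <= v  = runDecisionTree ifLess i
  | otherwise = runDecisionTree ifGreater i
runDecisionTree (Leaf o) _ = o

-- First subtree is the "<=" branch, second is the ">" branch.
classifier :: DecisionTree Stats Classification
classifier =
  Comparison curr_linkDensity 0.333333
    (Comparison prev_linkDensity 0.555556
      (Comparison curr_numWords 16
        (Comparison next_numWords 15
          (Comparison prev_numWords 4
            (Leaf Boilerplate)
            (Leaf Content))
          (Leaf Content))
        (Leaf Content))
      (Comparison curr_numWords 40
        (Comparison next_numWords 17
          (Leaf Boilerplate)
          (Leaf Content))
        (Leaf Content)))
    (Leaf Boilerplate)

-- Hypothetical inspection helper: longest chain of comparisons.
depth :: DecisionTree i o -> Int
depth (Leaf _)             = 0
depth (Comparison _ _ l r) = 1 + max (depth l) (depth r)

-- Hypothetical inspection helper: all outcomes, left to right.
leaves :: DecisionTree i o -> [o]
leaves (Leaf o)             = [o]
leaves (Comparison _ _ l r) = leaves l ++ leaves r
```

Classifying a block is then `runDecisionTree classifier stats`, while `depth classifier` and `leaves classifier` examine the same value without running it on any input.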
luqui answered Nov 07 '22 00:11
