
Extracting fields from a data type in Haskell

(I am fairly new to Haskell.) I made a data type like this:

data MatchingCondition = MatchingHead String | MatchingBody String | MatchingTail String

Now I want to write a function extractCondition :: MatchingCondition -> String which extracts the string out of this data type.

One way to do this is to explicitly write

extractCondition (MatchingHead x) = x
extractCondition (MatchingBody x) = x
extractCondition (MatchingTail x) = x

Right now there are only a few cases, so I could easily write this function down, but it would become a huge pain if there were more cases or if I were to add more conditions to my data type in the future. So is there an easier way to do this?

Besides, when my friend saw this code, he commented: this seems like it is defeating the type-safety introduced by the sum-type in the first place.

Can somebody explain what exactly this means?

kishlaya asked Dec 01 '22 10:12


2 Answers

It would be nice to be able to write an underscore in place of the constructor:

extractCondition (_ x) = x

Unfortunately, you cannot write such code in Haskell.

But you can always refactor your code to move the tag into a separate field:

data MatchingType = Head | Body | Tail
data MatchingCondition = Match MatchingType String

extractCondition (Match _ x) = x

There exist generic-programming techniques where you just write something like extract @String myMatching, which automatically returns the String from every constructor, so you don't need to write any per-constructor code at all (see Scrap Your Boilerplate). That is probably more machinery than you want here, though.
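As a rough illustration of the generic approach, here is a minimal sketch that uses Data.Data from base rather than the extract function named above. The helper name stringFields is hypothetical, and the sketch assumes the type may derive Data. It collects every immediate field of type String, whichever constructor was used.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, gmapQ)
import Data.Maybe (catMaybes)
import Data.Typeable (cast)

data MatchingCondition
  = MatchingHead String
  | MatchingBody String
  | MatchingTail String
  deriving (Show, Data)

-- Hypothetical helper: visit each immediate field of the value and keep
-- the ones that can be cast to String. No per-constructor clauses needed.
stringFields :: Data a => a -> [String]
stringFields = catMaybes . gmapQ cast
```

Note that the result is a list: the generic traversal cannot promise exactly one String per value, which is part of why this is heavier than the plain refactoring.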

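The record-based solution in the other answer is also valid. However, be careful with records inside sum types: a field selector for a field that is not present in every constructor is a partial function and fails at run time when applied to the wrong constructor.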
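A sketch of what a record-based version might look like, assuming every constructor is given the same field name (condition is a hypothetical name). The second type, Shape, is an invented example of the dangerous case, where a field exists in only one constructor.

```haskell
-- Every constructor carries the same field, so the generated selector
-- `condition` is total and can serve as extractCondition directly.
data MatchingCondition
  = MatchingHead { condition :: String }
  | MatchingBody { condition :: String }
  | MatchingTail { condition :: String }
  deriving Show

extractCondition :: MatchingCondition -> String
extractCondition = condition

-- Hypothetical counter-example: `radius` exists only in Circle, so
-- `radius (Square 2)` type-checks but throws an exception at run time.
data Shape
  = Circle { radius :: Double }
  | Square { side :: Double }
```

The first type stays safe only as long as every constructor added later also carries the condition field.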

Regarding the type-safety comment: it is best to ask your friend, since it's hard to tell what he means without seeing the whole code.

Shersh answered Dec 03 '22 22:12


I'll only cover this point:

Besides, when my friend saw this code, he commented: this seems like it is defeating the type-safety introduced by the sum-type in the first place.

Can somebody explain what exactly this means?

Your friend is probably worried that in the future you will add some constructor without a String inside.

Consider:

data T = A String | B String

getString :: T -> String
getString (A s) = s
getString (B s) = s

There's nothing wrong with this, as long as T is set in stone.

If, instead, later on T is changed, we might end up with

data T = A String | B String | C NotAString

getString :: T -> String
getString (A s) = s
getString (B s) = s
getString (C n) = error "not a string, let's make the program crash!"

If T is extended in this way, getString becomes a partial function, which should be avoided. With such a function around, one is tempted to write code such as

foo :: T -> ...
foo t = use (getString t)

and "forget" about the non-string case, potentially crashing the program. If instead foo has to pattern match on all constructors, we would remember that case as well.

When T is extended, the type getString :: T -> String becomes a lie to the user. It tells them that a String will always be produced, but that's not the case. Typical solutions include removing getString and letting foo do all the pattern matching, or, if "most" of the cases have a string, keeping getString but changing its type to getString :: T -> Maybe String, so that foo is forced to handle the "no string" case.
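Both options can be sketched as follows. NotAString is left undefined in this answer, so the newtype below is only a hypothetical stand-in, and describe is an invented example of a caller that pattern matches directly.

```haskell
-- Hypothetical stand-in for NotAString, which the answer does not define.
newtype NotAString = NotAString Int

data T = A String | B String | C NotAString

-- Option 1: keep getString but make it total. The Maybe in the type now
-- tells callers that a String may be absent.
getString :: T -> Maybe String
getString (A s) = Just s
getString (B s) = Just s
getString (C _) = Nothing

-- Option 2: no helper at all; the caller matches on every constructor,
-- so the non-string case cannot be forgotten silently.
describe :: T -> String
describe (A s) = "A: " ++ s
describe (B s) = "B: " ++ s
describe (C (NotAString n)) = "C: " ++ show n
```

With GHC's -Wincomplete-patterns warning enabled, leaving out the C clause in either function is reported at compile time.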

I'll also mention a common design error to avoid, which is a case of "boolean blindness". Some programmers are tempted to keep getString :: T -> String partial, and add a helper function

hasString :: T -> Bool

with the intention of using

foo t = if hasString t
   then use (getString t)
   else handleNoString

The issue here is that the user has to remember that each getString call must be guarded by hasString. The compiler does not help the programmer with that. This puts more burden on the programmer, who has to actively avoid the dangerous cases. The issue would not arise if we used Maybe instead.

foo t = case getString t of
   Just s  -> use s     -- now use (getString t) would be a type error, and rightly so!
   Nothing -> handleNoString

The hasString/getString design is present in some Java libraries, which probably contributed to it becoming widespread. (I was told that many Java programmers now consider it a bad design.)

chi answered Dec 04 '22 00:12