
Haskell type vs. newtype with respect to type safety [closed]

I know newtype is more often compared to data in Haskell, but I'm posing this comparison from more of a design point-of-view than as a technical problem.

In imperative/OO languages, there is the anti-pattern "primitive obsession", where the prolific use of primitive types reduces the type safety of a program and makes same-typed values accidentally interchangeable even though they are intended for different purposes. For example, many things can be a String, but it would be nice if a compiler could know, statically, which we mean to be a name and which we mean to be the city in an address.

So how often do Haskell programmers employ newtype to give type distinctions to otherwise primitive values? Using type introduces an alias and makes a program's intent more readable, but it doesn't prevent accidental interchange of values. As I learn Haskell, I notice that its type system is as powerful as any I have come across. I would therefore expect this to be a natural and common practice, but I haven't seen much, if any, discussion of the use of newtype in this light.

Of course a lot of programmers do things differently, but is this at all common in Haskell?
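To make the distinction in the question concrete, here is a minimal sketch contrasting a type synonym with a newtype. All names in it (NameAlias, CityAlias, Name, City, greet and so on) are hypothetical and chosen only for illustration.

```haskell
-- Hypothetical names for illustration; none come from the question.

-- Type synonyms: to the compiler both of these are just String.
type NameAlias = String
type CityAlias = String

greetAlias :: NameAlias -> String
greetAlias n = "Hello, " ++ n

-- Compiles even though a city is passed where a name is expected.
mixedUp :: String
mixedUp = greetAlias ("Paris" :: CityAlias)

-- Newtypes: distinct types that share an underlying representation.
newtype Name = Name String deriving (Show, Eq)
newtype City = City String deriving (Show, Eq)

greet :: Name -> String
greet (Name n) = "Hello, " ++ n

ok :: String
ok = greet (Name "Alice")

-- Rejected by the type checker if uncommented:
-- bad = greet (City "Paris")
```

With the synonyms, the mix-up goes unnoticed; with the newtypes, the commented-out line would fail to compile.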

StevenC asked Jun 13 '09 20:06




1 Answer

The main uses for newtypes are:

  1. For defining alternative instances for types.
  2. Documentation.
  3. Data/format correctness assurance.
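As a sketch of the first use in the list above, the same underlying type can get several instances of one class by wrapping it in different newtypes. The Add and Mul wrappers below are hypothetical; base ships similar wrappers such as Sum and Product in Data.Monoid. The sketch assumes GHC 8.4 or later, where Semigroup is in the Prelude.

```haskell
-- Hypothetical wrappers giving Int two different Monoid instances.
newtype Add = Add Int deriving (Show, Eq)
newtype Mul = Mul Int deriving (Show, Eq)

instance Semigroup Add where
  Add a <> Add b = Add (a + b)

instance Monoid Add where
  mempty = Add 0

instance Semigroup Mul where
  Mul a <> Mul b = Mul (a * b)

instance Monoid Mul where
  mempty = Mul 1

-- The wrapper chosen selects which instance foldMap combines with.
total :: [Int] -> Add
total = foldMap Add

productOf :: [Int] -> Mul
productOf = foldMap Mul
```

Here total [1,2,3,4] gives Add 10 and productOf [1,2,3,4] gives Mul 24, even though both work over plain Int values.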

I'm working on an application right now in which I use newtypes extensively. Newtypes in Haskell are a purely compile-time concept: with the unwrapper below, unFilename (Filename "x") compiles to the same code as "x". Wrapping and unwrapping has absolutely zero run-time cost, which is not the case for data types. This makes newtypes a very nice way to achieve the goals listed above.

    -- | A file name (not a file path).
    newtype Filename = Filename { unFilename :: String }
        deriving (Show, Eq)

I don't want to accidentally treat this as a file path. It's not a file path. It's the name of a conceptual file somewhere in the database.
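A small sketch of what that buys in practice: the Filename definition is repeated so the snippet stands alone, while describe and rawNames are hypothetical helpers. It also uses coerce from Data.Coerce (available in base since GHC 7.8) to illustrate the shared run-time representation.

```haskell
import Data.Coerce (coerce)

-- Repeated from the answer so the snippet is self-contained.
newtype Filename = Filename { unFilename :: String }
    deriving (Show, Eq)

-- Hypothetical function that accepts only file names, never raw strings.
describe :: Filename -> String
describe f = "file: " ++ unFilename f

-- Hypothetical helper: because Filename and String share a representation,
-- a whole list can be converted with coerce instead of mapping unFilename.
rawNames :: [Filename] -> [String]
rawNames = coerce

-- Rejected by the type checker if uncommented, since a String is not a Filename:
-- wrong = describe "/var/uploads/x"
```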

It's very important for algorithms to refer to the right thing, and newtypes help with this. It's also very important for security. For example, consider uploading files to a web application. I have these types:

    -- | A sanitized (safe) filename.
    newtype SanitizedFilename =
      SanitizedFilename { unSafe :: String } deriving Show

    -- | Unique, sanitized filename.
    newtype UniqueFilename =
      UniqueFilename { unUnique :: SanitizedFilename } deriving Show

    -- | An uploaded file.
    data File = File {
       file_name     :: String         -- ^ Uploaded file.
      ,file_location :: UniqueFilename -- ^ Saved location.
      ,file_type     :: String         -- ^ File type.
      } deriving (Show)

Suppose I have this function which cleans a filename from a file that's been uploaded:

    import Data.Char (isDigit, isLetter)

    -- | Sanitize a filename for saving to upload directory.
    sanitizeFilename :: String            -- ^ Arbitrary filename.
                     -> SanitizedFilename -- ^ Sanitized filename.
    sanitizeFilename = SanitizedFilename . filter ok where
      -- Keep only digits, letters, and the characters '-', '_' and '.'.
      ok c = isDigit c || isLetter c || elem c "-_."
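To see what the filter does, here is the same function applied to two sample inputs. The inputs and the names example1 and example2 are made up for illustration; the type and function are repeated so the snippet stands alone.

```haskell
import Data.Char (isDigit, isLetter)

newtype SanitizedFilename =
  SanitizedFilename { unSafe :: String } deriving Show

sanitizeFilename :: String -> SanitizedFilename
sanitizeFilename = SanitizedFilename . filter ok where
  ok c = isDigit c || isLetter c || elem c "-_."

-- Hypothetical sample inputs.
example1, example2 :: String
example1 = unSafe (sanitizeFilename "../etc/passwd")     -- "..etcpasswd"
example2 = unSafe (sanitizeFilename "my report (1).pdf") -- "myreport1.pdf"
```

Path separators, spaces and parentheses are dropped, while dots are kept, so a result may still begin with dots.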

Now from that I generate a unique filename:

    -- | Generate a unique filename.
    uniqueFilename :: SanitizedFilename -- ^ Sanitized filename.
                   -> IO UniqueFilename -- ^ Unique filename.

It's dangerous to generate a unique filename from an arbitrary filename; it should be sanitized first. Because a UniqueFilename can only be built from a SanitizedFilename, a unique filename is always safe by extension. I can save the file to disk now and put that filename in my database if I want to.
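The answer gives only the signature of uniqueFilename. Purely as an illustration, one possible body is sketched below. It assumes a per-process counter from Data.Unique in base is good enough; that is an assumption made here, not the answer's actual method. The numbers are unique only within a single program run, and the hashUnique documentation notes that collisions are possible in principle.

```haskell
import Data.Unique (newUnique, hashUnique)

newtype SanitizedFilename =
  SanitizedFilename { unSafe :: String } deriving Show

newtype UniqueFilename =
  UniqueFilename { unUnique :: SanitizedFilename } deriving Show

-- ASSUMPTION: uniqueness comes from Data.Unique's per-process counter.
-- The original answer does not show how its version works.
uniqueFilename :: SanitizedFilename -> IO UniqueFilename
uniqueFilename (SanitizedFilename name) = do
  u <- newUnique
  -- Digits and '-' are in the sanitizer's allowed set, so the
  -- prefixed name is still a sanitized name.
  let prefixed = show (hashUnique u) ++ "-" ++ name
  pure (UniqueFilename (SanitizedFilename prefixed))
```

Whatever the implementation, the signature is what matters for safety: a caller cannot obtain a UniqueFilename without first producing a SanitizedFilename.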

But it can also be annoying to have to wrap and unwrap a lot. In the long run, I see it as worth it, especially for avoiding value mismatches. ViewPatterns help somewhat:

    {-# LANGUAGE ViewPatterns #-}

    -- | Get the form fields for a form.
    formFields :: ConferenceId -> Controller [Field]
    formFields (unConferenceId -> cid) = getFields where
       ... code using cid ..

Maybe you'll say that unwrapping it in a function is a problem: what if you pass cid to a function wrongly? That's not an issue, because all functions using a conference id will take the ConferenceId type. What emerges is a sort of function-to-function contract system that is enforced at compile time. Pretty nice. So yeah, I use it as often as I can, especially in big systems.
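For readers who have not used view patterns, here is a self-contained sketch of the same idea. ConferenceId is defined as a hypothetical wrapper around Int, and fieldLabel stands in for the answer's formFields, whose real types are not shown.

```haskell
{-# LANGUAGE ViewPatterns #-}

-- Hypothetical stand-in for the answer's ConferenceId.
newtype ConferenceId = ConferenceId { unConferenceId :: Int }
  deriving (Show, Eq)

-- The view pattern applies unConferenceId to the argument and binds the
-- result to cid, so the body works with the plain Int.
fieldLabel :: ConferenceId -> String
fieldLabel (unConferenceId -> cid) = "conference-" ++ show cid

-- Equivalent without the extension: match on the constructor directly.
fieldLabel' :: ConferenceId -> String
fieldLabel' (ConferenceId cid) = "conference-" ++ show cid
```

Both versions accept only a ConferenceId, so fieldLabel (ConferenceId 7) yields "conference-7", while passing a bare Int is a type error.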

Christopher Done answered Sep 24 '22 04:09
