 

Is there ever a good reason to use unsafePerformIO?

Tags:

haskell

The question says it all. More specifically, I am writing bindings to a C library, and I'm wondering which C functions I can safely wrap with unsafePerformIO. I assume using unsafePerformIO with anything involving pointers is a big no-no.

It would be great to see other cases where it is acceptable to use unsafePerformIO too.

Vlad the Impala asked May 10 '12 07:05


3 Answers

No need to involve C here. The unsafePerformIO function can be used in any situation where:

  1. You know that its use is safe, and

  2. You are unable to prove its safety using the Haskell type system.

For instance, you can make a memoize function using unsafePerformIO:

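import Control.Concurrent.MVar (newMVar, modifyMVar)
import qualified Data.Map as Map
import System.IO.Unsafe (unsafePerformIO)

memoize :: Ord a => (a -> b) -> a -> b
memoize f = unsafePerformIO $ do
    memo <- newMVar Map.empty  -- cache shared by every call to the returned function
    return $ \x -> unsafePerformIO $ modifyMVar memo $ \memov ->
        return $ case Map.lookup x memov of
            Just y -> (memov, y)        -- hit: reuse the stored result
            Nothing -> let y = f x      -- miss: store a lazy thunk for f x
                       in (Map.insert x y memov, y)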

(This is off the top of my head, so I have no idea if there are flagrant errors in the code.)

The memoize function uses and modifies a memoization dictionary, but since the function as a whole is safe, you can give it a pure type (with no use of the IO monad). However, you have to use unsafePerformIO to do that.
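One way to check that the cache is actually shared is to memoize a function that announces each evaluation. The following is a minimal sketch, not part of the original answer: slowSquare and fastSquare are hypothetical names, and the NOINLINE pragmas are an assumed precaution against the optimizer duplicating the hidden MVar.

```haskell
import Control.Concurrent.MVar (newMVar, modifyMVar)
import qualified Data.Map as Map
import Debug.Trace (trace)
import System.IO.Unsafe (unsafePerformIO)

-- Same memoize as in the answer, repeated so the sketch is self-contained.
{-# NOINLINE memoize #-}
memoize :: Ord a => (a -> b) -> a -> b
memoize f = unsafePerformIO $ do
    memo <- newMVar Map.empty
    return $ \x -> unsafePerformIO $ modifyMVar memo $ \memov ->
        return $ case Map.lookup x memov of
            Just y  -> (memov, y)
            Nothing -> let y = f x in (Map.insert x y memov, y)

-- Hypothetical "expensive" function; trace reports each real evaluation.
slowSquare :: Int -> Int
slowSquare n = trace ("computing " ++ show n) (n * n)

-- Bind the memoized function once at top level so the cache is shared.
{-# NOINLINE fastSquare #-}
fastSquare :: Int -> Int
fastSquare = memoize slowSquare

main :: IO ()
main = do
    print (fastSquare 12)
    print (fastSquare 12)  -- answered from the cache
```

Both calls print 144, and the "computing 12" message should appear only once. Writing memoize slowSquare 12 inline at each call site would instead be free to build a fresh cache every time, which is one of the subtle ways this kind of code goes wrong.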

Footnote: When it comes to the FFI, you are responsible for providing the types of the C functions to the Haskell system. You can achieve the effect of unsafePerformIO by simply omitting IO from the type. The FFI system is inherently unsafe, so using unsafePerformIO doesn't make much of a difference.

Footnote 2: Code that uses unsafePerformIO often contains really subtle bugs; the example is just a sketch of a possible use. In particular, unsafePerformIO can interact poorly with the optimizer.

Dietrich Epp answered Oct 17 '22 19:10


In the specific case of the FFI, unsafePerformIO is meant to be used for calling things that are mathematical functions, i.e. the output depends solely on the input parameters, and every time the function is called with the same inputs, it will return the same output. Also, the function shouldn't have side effects, such as modifying data on disk, or mutating memory.

Most functions from <math.h> could be called with unsafePerformIO, for example.
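As an illustration, here is a minimal sketch that binds C's sin in two ways: imported in IO and then wrapped with unsafePerformIO, and imported directly at a pure type. The Haskell-side names (c_sin_io, c_sin_pure, sinViaUnsafe, sinPure) are made up for this example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble(..))
import System.IO.Unsafe (unsafePerformIO)

-- The conservative import: the call lives in IO.
foreign import ccall unsafe "math.h sin"
  c_sin_io :: CDouble -> IO CDouble

-- The same C function imported at a pure type. The FFI trusts this
-- declaration; nothing checks that sin really is side-effect free.
foreign import ccall unsafe "math.h sin"
  c_sin_pure :: CDouble -> CDouble

-- Wrapping the IO import is acceptable here because sin depends only on
-- its argument and touches no memory visible to Haskell.
sinViaUnsafe :: Double -> Double
sinViaUnsafe x = realToFrac (unsafePerformIO (c_sin_io (realToFrac x)))

sinPure :: Double -> Double
sinPure = realToFrac . c_sin_pure . realToFrac

main :: IO ()
main = print (sinViaUnsafe 0, sinPure 0)  -- prints (0.0,0.0)
```

The two wrappers behave identically. For a function like this, importing at the pure type is usually the simpler choice.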

You're correct that unsafePerformIO and pointers don't usually mix. For example, suppose you have

double p_sin(double *p) { return sin(*p); }

Even though you're just reading a value from a pointer, it's not safe to use unsafePerformIO. If you wrap p_sin as a pure function, several calls can receive the same pointer argument yet return different results, because the memory behind the pointer may have changed in between. It's necessary to keep the function in IO to ensure that it's sequenced properly in relation to pointer updates.

This example should make clear one reason why this is unsafe:

# file export.c

#include <math.h>
double p_sin(double *p) { return sin(*p); }

# file main.hs
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.Ptr
import Foreign.Marshal.Alloc
import Foreign.Storable

foreign import ccall "p_sin"
  p_sin :: Ptr Double -> Double

foreign import ccall "p_sin"
  safeSin :: Ptr Double -> IO Double

main :: IO ()
main = do
  p <- malloc
  let sin1  = p_sin p
      sin2  = safeSin p
  poke p 0
  putStrLn $ "unsafe: " ++ show sin1
  sin2 >>= \x -> putStrLn $ "safe: " ++ show x

  poke p 1
  putStrLn $ "unsafe: " ++ show sin1
  sin2 >>= \x -> putStrLn $ "safe: " ++ show x

When compiled and run, this program outputs

$ ./main 
unsafe: 0.0
safe: 0.0
unsafe: 0.0
safe: 0.8414709848078965

Even though the value referenced by the pointer has changed between the two references to "sin1", the expression isn't re-evaluated, leading to stale data being used. Since safeSin (and hence sin2) is in IO, the program is forced to re-evaluate the expression, so the updated pointer data is used instead.
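A common way to stay out of trouble is to keep only the pointer dereference in IO and do the arithmetic with an ordinary pure function. The sketch below uses plain Haskell instead of the C helper so that it runs on its own. sinAt is a hypothetical name, and alloca stands in for the malloc used above.

```haskell
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Reading through the pointer is the only effect, so only that part
-- needs IO; sin itself stays pure.
sinAt :: Ptr Double -> IO Double
sinAt p = sin <$> peek p

main :: IO ()
main = alloca $ \p -> do
    poke p 0
    a <- sinAt p   -- sees the value 0
    poke p 1
    b <- sinAt p   -- runs again, so it sees the value 1
    print (a, b)
```

Because each sinAt p is an IO action that is executed at the point where it appears, the second read cannot be satisfied from a stale, already-evaluated thunk.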

John L answered Oct 17 '22 18:10


Obviously, if it were never meant to be used, it wouldn't be in the standard libraries. ;-)

There are a number of reasons why you might use it. Examples include:

  • Initialising global mutable state. (Whether you should ever have such a thing in the first place is a whole other discussion...)

  • Lazy I/O is implemented using this trick. (Again, whether lazy I/O is a good idea in the first place is debatable.)

  • The trace function uses it. (Yet again, it turns out trace is rather less useful than you might imagine.)

  • Perhaps most significantly, you can use it to implement data structures which are referentially transparent, but internally implemented using impure code. Often the ST monad will let you do that, but sometimes you need a little unsafePerformIO.

Lazy I/O can be seen as a special case of the last point. So can memoisation.
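The first item in the list, global mutable state, is conventionally written as a top-level IORef created with unsafePerformIO. This is a minimal sketch with hypothetical names (counter, nextId). The NOINLINE pragma matters: without it the compiler may inline the definition and create more than one IORef.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import System.IO.Unsafe (unsafePerformIO)

-- A single global cell. NOINLINE keeps GHC from duplicating the
-- newIORef call at each use site.
{-# NOINLINE counter #-}
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)

-- All access to the cell still happens in IO.
nextId :: IO Int
nextId = atomicModifyIORef' counter (\n -> (n + 1, n))

main :: IO ()
main = do
    a <- nextId
    b <- nextId
    print (a, b)  -- prints (0,1)
```

Only the creation of the cell is hidden behind unsafePerformIO. Reads and writes stay in IO, which is why this pattern is comparatively easy to reason about.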

Consider, for example, an "immutable", growable array. Internally you could implement that as a pure "handle" that points to a mutable array. The handle holds the user-visible size of the array, but the actual underlying mutable array is larger than that. When the user "appends" to the array, a new handle is returned, with a new, larger size, but the append is performed by mutating the underlying mutable array.

You can't do this with the ST monad. (Or rather, you can, but it still requires unsafePerformIO.)

Note that it's damned tricky to get this sort of thing right. And the type checker won't catch it if you're wrong. (That's what unsafePerformIO does; it makes the type checker not check that you're doing it correctly!) For example, if you append to an "old" handle, the correct thing to do would be to copy the underlying mutable array. Forget this, and your code will behave very strangely.
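To make the idea concrete, here is a rough sketch of such a growable array, including the copy when appending to an old handle. Every name in it (Buffer, Vec, emptyVec, append, index) is invented for illustration. It assumes the array package that ships with GHC, and it has not been audited against aggressive optimization or heavy concurrency.

```haskell
import Control.Monad (forM_)
import Data.Array.IO (IOArray, getBounds, newArray_, readArray, writeArray)
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Shared mutable part: number of slots already claimed, plus the storage.
data Buffer a = Buffer (IORef Int) (IOArray Int a)

-- Pure handle: the size visible through this handle, plus the shared buffer.
data Vec a = Vec Int (Buffer a)

emptyVec :: Int -> Vec a
emptyVec cap = unsafePerformIO $ do
    used  <- newIORef 0
    store <- newArray_ (0, cap - 1)
    return (Vec 0 (Buffer used store))

append :: Vec a -> a -> Vec a
append (Vec n (Buffer used store)) x = unsafePerformIO $ do
    (_, hi) <- getBounds store
    -- Claim slot n only if this handle is the newest one and there is room.
    claimed <- atomicModifyIORef' used $ \k ->
        if k == n && n <= hi then (k + 1, True) else (k, False)
    if claimed
        then do
            writeArray store n x
            return (Vec (n + 1) (Buffer used store))
        else do
            -- Old handle or full buffer: copy the visible prefix first.
            let cap' = max (n + 1) (2 * (hi + 1))
            store' <- newArray_ (0, cap' - 1)
            forM_ [0 .. n - 1] $ \i -> readArray store i >>= writeArray store' i
            writeArray store' n x
            used' <- newIORef (n + 1)
            return (Vec (n + 1) (Buffer used' store'))

-- Slots below a handle's size are never overwritten, so reads are stable.
index :: Vec a -> Int -> a
index (Vec n (Buffer _ store)) i
    | i < 0 || i >= n = error "index out of range"
    | otherwise       = unsafePerformIO (readArray store i)

main :: IO ()
main = do
    let v1     = append (emptyVec 2) 'a'
        newest = append v1 'b'  -- v1 is the newest handle: mutates in place
        stale  = append v1 'c'  -- v1 is now an old handle: copies instead
    print (index newest 1, index stale 1)  -- prints ('b','c')
```

The claim check in append is the part that is easy to forget. Without it, the second append would overwrite slot 1 and index newest 1 would silently change from 'b' to 'c'.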

Now, to answer your real question: There's no particular reason why "anything involving pointers" should be a no-no for unsafePerformIO. When asking whether to use this function or not, the only question of significance is this: Can the end-user observe any side-effects from doing this?

If the only thing it does is create some buffer somewhere that the user can't "see" from pure code, that's fine. If it writes to a file on disk... not so fine.
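A typical FFI instance of the "invisible buffer" case is a C function that returns a second result through an out-pointer. The sketch below wraps modf from math.h; c_modf and splitFraction are hypothetical names. The pointer is allocated, written, and read entirely inside the wrapper, so no pure code can observe it.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble(..))
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek)
import System.IO.Unsafe (unsafePerformIO)

-- double modf(double x, double *iptr): returns the fractional part and
-- stores the integral part through iptr.
foreign import ccall unsafe "math.h modf"
  c_modf :: CDouble -> Ptr CDouble -> IO CDouble

-- The temporary buffer never escapes this function, and the result
-- depends only on x, so a pure type is justified.
splitFraction :: Double -> (Double, Double)
splitFraction x = unsafePerformIO $ alloca $ \p -> do
    frac     <- c_modf (realToFrac x) p
    integral <- peek p
    return (realToFrac integral, realToFrac frac)

main :: IO ()
main = print (splitFraction 3.5)  -- prints (3.0,0.5)
```

So pointers as such are not the problem. What matters is whether the memory behind them can change, or be seen, outside the wrapped call.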

HTH.

MathematicalOrchid answered Oct 17 '22 17:10