
Profiling a partially evaluated program

I want to profile a partially evaluated program, so I'm interested in the best way to terminate a GHC program such that the profiling data collected so far is still written out. This is useful for programs that take a long time to run, possibly forever.

With GHC 7.4.2, I was able to profile a non-terminating program by enabling profiling (-prof -auto-all) and running my program with +RTS -p. This generated incremental profiling data. The program could be killed with ^c, and the .prof file would contain data. In GHC 7.6 and later, it appears that if the program can be terminated with a single ^c, then profiling information is written to output. However (especially with newer versions of GHC?) a single ^c doesn't kill the program, at least not before I get impatient and hit ^c again. Usually two ^c will kill the program, but then no profiling data is written to output.

Concretely, consider the problem of trying to profile StupidFib.hs:

fib n = fib (n - 1) + fib (n - 2)
main = print $ fib 100

Compiling with -prof and running with +RTS -p, I can kill this program with a single ^c during roughly the first 10 seconds of execution, but after that it takes two ^c. Watching resource usage, this change appears to coincide with the program exhausting my physical memory and spilling into swap space; however, that could be coincidental.

Why does ^c work sometimes, but not other times for the same program? What is the easiest way to ensure that profiling data will get printed when the program does not terminate on its own?

asked Nov 17 '14 23:11 by crockeea


1 Answer

Most likely, the second signal is being delivered before the program has finished handling the first one, and at this point the signal's action has been reset to the default action, which (for SIGINT) is to terminate the program. Because of the swapping, there's a significant interval before the profiling code can write out the profiling data, during which time the program is vulnerable to a second SIGINT.

Moral of the story: be patient. If you wait long enough, the program will finish and the data will be written out. Regarding that second ^C, tell yourself, "Just don't do it!" :-)

One could argue that the Haskell runtime should set signal options such that a second SIGINT is ignored, but that would be risky because there'd be no easy way to terminate the program if things got really messed up trying to handle the signal.

You probably also want to avoid programs that exceed physical memory and induce a lot of swapping. At that point, your computation is effectively stalled and there's not much point in continuing. Use +RTS -M<size> (for example, +RTS -M1g) to cap the heap size and avoid getting into this situation.
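If waiting on ^C is too tempting, one alternative (not part of the advice above, just a sketch) is to let the program stop itself after a fixed time budget, so that it exits through the normal path and the profile is written without any signal involved. The version below wraps the computation from StupidFib.hs in `timeout` from `System.Timeout`; the name `budgetMicros` and the two-second budget are assumptions chosen for illustration.

```haskell
import System.IO (hPutStrLn, stderr)
import System.Timeout (timeout)

-- Same non-terminating function as StupidFib.hs (no base case, on purpose).
fib :: Integer -> Integer
fib n = fib (n - 1) + fib (n - 2)

-- Hypothetical time budget in microseconds; adjust to suit the run.
budgetMicros :: Int
budgetMicros = 2 * 1000000

main :: IO ()
main = do
  -- timeout yields Nothing if the action was cut off when the budget ran out.
  result <- timeout budgetMicros (print (fib 100))
  case result of
    Nothing -> hPutStrLn stderr "time budget exhausted; exiting normally"
    Just () -> pure ()
```

Compiled with -prof and run with +RTS -p as before, this should return from main on its own once the budget is spent. It relies on the computation allocating, which is how the runtime gets a chance to deliver the timeout; a tight loop that never allocates might not be interrupted this way.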

answered Nov 17 '22 17:11 by Neil Mayhew