Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Profiling high-performance Haskell code

I have some high-performance Haskell code - the inner loop compiles down to 6 assembly instructions. Modifying the inner loop to be less efficient doesn't have any noticable impact on the performance, suggesting the inner loop is not the bottleneck. However, when I turn on profiling, the assembly code generated for the inner loop becomes dramatically less efficient, and the profiler reports that the inner loop takes 85% of the time.

I suspect something is unnecessarily slow, but when I use profiling to see what, I suspect that profiling makes the inner loop sufficiently slow that it dominates. What techniques can I use to see where the time is going? A sampling profiler would be great, if one existed for Haskell.

like image 223
Neil Mitchell Avatar asked Jan 02 '14 10:01

Neil Mitchell


1 Answers

You can use Linux perf events: https://ghc.haskell.org/trac/ghc/wiki/Debugging/LowLevelProfiling/Perf

This will give you an output that looks like:

# Samples: 9161149923 # # Overhead  Command      Shared Object  Symbol # ........  .......  .................  ...... #     30.65%   queens  queens             [.] s1ql_info     18.67%   queens  queens             [.] s1qj_info     12.17%   queens  queens             [.] s1qi_info      9.94%   queens  queens             [.] s1o9_info      5.85%   queens  queens             [.] r1nI_info      5.33%   queens  queens             [.] s1sF_info      5.18%   queens  queens             [.] s1sG_info      3.69%   queens  queens             [.] s1oP_info      1.68%   queens  queens             [.] stg_upd_frame_info      0.88%   queens  queens             [.] stg_ap_2_upd_info      0.62%   queens  queens             [.] s1sE_info      0.56%   queens  [kernel]           [k] read_hpet      0.39%   queens  queens             [.] stg_ap_p_info      0.35%    :2030             f76beb  [.] 0x00000000f76beb      0.31%   queens  queens             [.] s1oD_info      0.28%  swapper  [kernel]           [k] mwait_idle_with_hints      0.25%   queens  queens             [.] __stg_gc_enter_1      0.23%   queens  queens             [.] evacuate      0.18%  swapper  [kernel]           [k] read_hpet      0.12%   queens  queens             [.] scavenge_block 

If you save the core as you compile you can map these symbols back to the functions in core.

A bit painful, but gives you more trustworthy results.

There is some work afoot to do this automatically.

like image 103
tibbe Avatar answered Sep 22 '22 23:09

tibbe