Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Specialization with Constraints

Tags:

haskell

ghc

I'm having problems getting GHC to specialize a function with a class constraint. I have a minimal example of my problem here: Foo.hs and Main.hs. The two files compile (GHC 7.6.2, ghc -O3 Main) and run.

NOTE: Foo.hs is really stripped down. If you want to see why the constraint is needed, you can see a little more code here. If I put the code in a single file or make many other minor changes, GHC simply inlines the call to plusFastCyc. This will not happen in the real code because plusFastCyc is too large for GHC to inline, even when marked INLINE. The point is to specialize the call to plusFastCyc, not inline it. plusFastCyc is called in many places in the real code, so duplicating such a large function would not be desirable even if I could force GHC to do it.

The code of interest is the plusFastCyc in Foo.hs, reproduced here:

{-# INLINEABLE plusFastCyc #-}
{-# SPECIALIZE plusFastCyc :: 
         forall m . (Factored m Int) => 
              (FastCyc (VT U.Vector m) Int) -> 
                   (FastCyc (VT U.Vector m) Int) -> 
                        (FastCyc (VT U.Vector m) Int) #-}

-- Although the next specialization makes `fcTest` fast,
-- it isn't useful to me in my real program because the phantom type M is reified
-- {-# SPECIALIZE plusFastCyc :: 
--          FastCyc (VT U.Vector M) Int -> 
--               FastCyc (VT U.Vector M) Int -> 
--                    FastCyc (VT U.Vector M) Int #-}

plusFastCyc :: (Num (t r)) => (FastCyc t r) -> (FastCyc t r) -> (FastCyc t r)
plusFastCyc (PowBasis v1) (PowBasis v2) = PowBasis $ v1 + v2

The Main.hs file has two drivers: vtTest, which runs in ~3 seconds, and fcTest, which runs in ~83 seconds when compiled with -O3 using the forall'd specialization.

The core shows that for the vtTest test, the addition code is being specialized to Unboxed vectors over Ints, etc, while generic vector code is used for fcTest. On line 10, you can see that GHC does write a specialized version of plusFastCyc, compared to the generic version on line 167. The rule for the specialization is on line 225. I believe this rule should fire on line 270. (main6 calls iterate main8 y, so main8 is where plusFastCyc should be specialized.)

My goal is to make fcTest as fast as vtTest by specializing plusFastCyc. I've found two ways to do this:

  1. Explicity call inline from GHC.Exts in fcTest.
  2. Remove the Factored m Int constraint on plusFastCyc.

Option 1 is unsatisfactory because in the actual code base plusFastCyc is a frequently used operation and a very large function, so it should not be inlined at every use. Rather, GHC should call a specialized version of plusFastCyc. Option 2 is not really an option because I need the constraint in the real code.

I've tried a variety of options using (and not using) INLINE, INLINABLE, and SPECIALIZE, but nothing seems to work. (EDIT: I may have stripped out too much of plusFastCyc to make my example small, so INLINE might cause the function to be inlined. This doesn't happen in my real code because plusFastCyc is so large.) In this particular example, I'm not getting any match_co: needs more cases or RULE: LHS too complicated to desugar (and here) warnings, though I was getting many match_co warnings before minimizing the example. Presumably, the "problem" is the Factored m Int constraint in the rule; if I make changes to that constraint, fcTest runs as fast as vtTest.

Am I doing something GHC just doesn't like? Why won't GHC specialize the plusFastCyc, and how can I make it?

UPDATE

The problem persists in GHC 7.8.2, so this question is still relevant.

like image 293
crockeea Avatar asked Jan 12 '14 05:01

crockeea


People also ask

What are the four types of specialization generalization?

Slide 4- 22 Hence, we have four types of specialization/generalization: Disjoint, total. Disjoint, partial. Overlapping, total.

What is specialization explain with example?

Specialization involves focusing on a specific skill, activity, or production process, such as a South American company harvesting bananas, to become the leader or expert.

Which is generalization constraint?

There are three types of constraints on generalization which are as follows: First one determines which entity can be a member of the low-level entity set. Second relates to whether or not entities belong to more than one lower-level entity set.

What is specialization in DBMS?

Specialization is a top-down approach in which a higher-level entity is divided into multiple specialized lower-level entities. In addition to sharing the attributes of the higher-level entity, these lower-level entities have specific attributes of their own.


1 Answers

GHC also gives an option to SPECIALIZE a type-class instance declaration. I tried this with the (expanded) code of Foo.hs, by putting the following:

instance (Num r, V.Vector v r, Factored m r) => Num (VT v m r) where 
    {-# SPECIALIZE instance ( Factored m Int => Num (VT U.Vector m Int)) #-}
    VT x + VT y = VT $ V.zipWith (+) x y

This change, though, did not achieve the desired speedup. What did achieve that performance improvement was manually adding a specialized instance for the type VT U.Vector m Int with the same function definitions, as follows:

instance (Factored m Int) => Num (VT U.Vector m Int) where 
    VT x + VT y = VT $ V.zipWith (+) x y

This requires adding OverlappingInstances and FlexibleInstances in LANGUAGE.

Interestingly, in the example program, the speedup obtained with the overlapping instance remains even if you remove every SPECIALIZE and INLINABLE pragma.

like image 198
Diego E. Alonso-Blas Avatar answered Oct 16 '22 12:10

Diego E. Alonso-Blas