Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a way to embed unit-handling logic in the OCaml type system?

This is probably impossible, but I feel like I might learn something, or there might be a wildly different way to approach it.

I'm writing some code that has some physics simulation elements, and I might be dealing in a bunch of different units. I feel like it's worth making the type system do some work for me on this, so that I can't, for instance, add mass to distance or something like that.

That would be easy enough:

module Mass : sig 
  type t
  val sum : t -> t -> t
  ...
end = struct
  type t = int
  let sum = +
  ...
end
module Distance : sig 
  type t
  val sum : t -> t -> t
  ...
end = struct
  type t = int
  let sum = +
  ...
end

And now the compiler will stop me from trying to mix the two (right?). Something similar should also work to express units of the same type (like, pounds vs. kilograms), and maybe even keep me safe from some precision or overflow errors. So far so easy. The interesting/difficult part is that I'd like to make a fluid framework for handling combinations of units, like meters per second squared, etc.

I can come kind of close by playing with functors:

module MakeProduct(F : UnitT)(S: UnitT) = struct
   type t = (F.t, S.t)
   ...
end
module MakeRatio(Numerator : UnitT)(Denominator: UnitT) = struct
   type t = (Numerator.t, Denominator.t)
   ...
end

and then I could just have

module MetersPerSecondSquared = MakeRatio(MakeRatio(Meter)( Second))(Second)

There would be some very clunky function names, but that should give me a type-safe system in which I can multiply 25 m/s^2 by 5s and get 125m/s.

The problem I see, other than clunkiness, is that the system won't be able to recognize types that express the same things in different order. For instance, I could also express the above as:

MakeRatio(Meter)(Product(Second)(Second))

Both should ultimately express the same concept, but I don't know of a way to tell the type system that these are the same, or that you should still be able to multiply the second by 5s and get a result in m/s.

What I'm trying to learn is:

  1. Is there a way to make this idea work after all?
  2. If not, is there a formal/theoretical reason this is hard? (just for my own education)
  3. Is there a completely different way to cleanly handle different units in the type system?
like image 200
Edward Peters Avatar asked Sep 13 '25 05:09

Edward Peters


1 Answers

It is possible to do some limited type-level arithmetic with the right encoding. However, any encoding will be limited by the fact that OCaml type system does not know about arithmetic, and cannot be tricked into proving complicated arithmetic theorem by itself.

One encoding that may strike a good compromise between complexity and features is to use a fixed set of core units (for instance m, s and kg) and a phantom type that describes the units of a float.

module Units: sig
  type 'a t
  val m: <m: ?one; s: ?zero; kg: ?zero> t
end = struct
  type 'a t = float
  let m = 1.
end

Here the type <m:'m; s:'s; kg:'kg> Units.t is essentially a float augmented with some a type parameter <m:'m; s:'s; kg:'kg> that describe the units exponents of each base units.

We want this exponent to be a type-level integer (so ?zero should be a type-level encoding for 0 etc...).

One useful encoding for integers is to encode them as translation rather on top of unary integer. First, we can define an unary z (for zero) type and a successor function at the type level:

type z = Zero
type 'a succ = S

Then we can encode zero as the function that maps integer n to n:

type 'n zero = 'n * 'n

and one as the function that maps integer n to n + 1:

type 'n one = 'n * 'n succ

With this encoding, we can fill out the ?zero and ?one placeholder in the Unit module:

module Unit: sig
  type +'a t

  (* Generators *)
  val m: <m:_ one; s:_ zero; kg:_ zero> t
  val s: <m:_ zero; s:_ one; kg:_ zero> t
  val kg: <m:_ zero; s:_ zero; kg:_ one> t
  ...
end

Then we can use our translation encoding to trick the typechecker into computing addition through type unification:

  val ( * ):
    <m:'m_low * 'm_mid; s:'s_low * 's_mid; kg:'kg_low * 'kg_mid>  t ->
    <m:'m_mid * 'm_high; s:'s_mid * 's_high; kg:'kg_mid * 'kg_high>  t ->
    <m:'m_low * 'm_high; s:'s_low * 's_high; kg:'kg_low * 'kg_high>  t

Here, if we look at what is happening on each component, we are essentially stating that if we have a translation from 'm_low to 'm_mid and another translation from 'm_mid to m_high, the sum of those two translations is the translation from 'm_low to 'm_high. Thus, we have implemented addition at the type-level.

Putting everything together, we end up with

module Unit: sig
  type +'a t

  (* Generators *)
  (* Floats are dimensionless *)
  val scalar: float -> <m:_ zero; s: _ zero; kg: _ zero> t
  val float: <m:_ zero; s: _ zero; kg: _ zero> t -> float
  (* Base units *)
  val m: <m:_ one; s:_ zero; kg:_ zero> t
  val s: <m:_ zero; s:_ one; kg:_ zero> t
  val kg: <m:_ zero; s:_ zero; kg:_ one> t

  (* Arithmetic operations *)
  val ( + ): 'a t -> 'a t -> 'a t
  val ( * ):
    <m:'m_low * 'm_mid; s:'s_low * 's_mid; kg:'kg_low * 'kg_mid>  t ->
    <m:'m_mid * 'm_high; s:'s_mid * 's_high; kg:'kg_mid * 'kg_high>  t ->
    <m:'m_low * 'm_high; s:'s_low * 's_high; kg:'kg_low * 'kg_high>  t

  val ( / ) :
    <m:'m_low * 'm_high; s:'s_low * 's_high; kg:'kg_low * 'kg_high> t ->
    <m:'m_mid * 'm_high; s:'s_mid * 's_high; kg:'kg_mid * 'kg_high> t ->
    <m:'m_low * 'm_mid ; s:'s_low * 's_mid ; kg:'kg_low * 'kg_mid > t
end = struct
 type +'a t = float
 let scalar x = x let float x = x
 let ( + ) = ( +. ) let ( * ) = ( *. ) let ( / ) = ( /. )
 let m = 1. let s = 1. let kg = 1.
end

Then we get the expected behavior: only values with the same dimension can added (or subtracted), multiplying values is done by adding the dimensional components (and the reverse for division). For instance, this code compiles correctly

open Units
let ( *. ) x y = scalar x * y
let au = 149_597_870_700. *. m
let c  = 299_792_458. *. m / s
let year = 86400. *. (365. *. s)
let ok = float @@ (c * year) / au

whereas trying to add an astronomical unit to a year yields a type error

let error = year + au

Error: This expression has type < kg : 'a * 'a; m : 'b * 'b succ; s : 'c * 'c > Unit.t but an expression was expected of type < kg : 'a * 'a; m : 'b * 'b; s : 'd * 'd succ > Unit.t The type variable 'b occurs inside 'b succ

However, the type error is not really understandable ... which is a common problem with using an encoding.

There is another important limitation with such encoding is that we are using the unification of type variable to do computation. By doing so, whenever the type variable has not been generalized, we are consuming it when doing the computation. This leads to strange errors. For instance, this function works fine

let strange_but_ok x y = m * x +  ((y/m) * m) * m

whereas this one does not typecheck

let strange_and_fail x = m * x +  ((x/m) * m) * m

Fortunately, since our phantom type parameter is covariant, the relaxed value restrictions will ensure that most of the time the type variables are generalized on time; and the issue will only manifest when mixing together function arguments of different dimensions.

Another important limitation of this encoding is that we are limited to additions, subtractions, multiplications and divisions of units. It is for instance impossible to compute a square root with this representation.

A way to lift this limitation is to still use a phantom type parameters for units, represent addition, multiplication and more using type constructors, and add some axiomatic equality between those type constructors. But then the user has to prove by hand the equivalence between different representations of the same integer.

like image 111
octachron Avatar answered Sep 15 '25 00:09

octachron