I am working on a project whose focus is the use of term rewriting to solve/simplify fixed-size bit-vector arithmetic problems, which is something useful to do as a prior step to some decision procedure such as those based on bit-blasting. The term rewriting may solve the problem at all, or otherwise produce a much simpler equivalent problem, so the combination of both can result in a considerable speed-up.
I am aware that many SMT solvers implement this strategy (e.g. Boolector, Beaver, Alt-Ergo or Z3), but it is being hard to find papers/tech-reports/etc in which these rewriting steps are described in detail. In general, I only found papers in which the authors describe such simplification steps in a few paragraphs. I would like to find some document explaining in detail the use of term rewriting: providing examples of rules, discussing the convenience of AC rewriting and/or other equational axioms, use of rewriting strategies, etc.
For now, I just have found the technical report A Decision Procedure for Fixed-Width Bit-Vectors which describes normalization/simplification steps performed by CVC Lite, and I would like to find more technical reports like this one! I have also found a poster about Term rewriting in STP but it is just a brief summary.
I have already visited the websites of those SMT solvers and I have searched in their Publications pages...
I would appreciate any reference, or any explanation of how term rewriting is being used in current versions of well-known SMT solvers. I am specially interested in Z3 because it looks to have one of the smartest simplification phases. For instance, Z3 3.* introduced a new bit-vector decision procedure but, unfortunately, I was not able to find any paper describing it.
I agree with you. It is hard to find papers describing the preprocessing steps used in modern SMT solvers.
Most SMT solver developers agree that these preprocessing steps are very important for the Bit-Vector theory.
I believe these techniques are not published for several reasons: most of them a little tricks that by themselves are not a significant scientific contribution; most of the techniques only work in the context of a particular system; a technique that may seem to work very well on solver A
, does not work on solver B
.
I believe that having open source SMT solvers is the only way to address this issue. Even if we publish the techniques used in a particular solver A
, it would be very hard to reproduce the actual behavior of solver A without seeing its source code.
Anyway, here is a summary of the preprocessing performed by Z3, and important details.
Several simplification rules, may reduce this size locally, but increase it globally. A simplifier must avoid the memory blowup caused by this kind of simplification. You can find examples at: http://research.microsoft.com/en-us/um/people/leonardo/mit2011.pdf
The first simplification step only performs local simplifications that preserve equivalence. Examples:
2*x - x -> x x and x -> x
t = v
where v
is a value. It replaces t
everywhere with v
.t = 0 and F[t] -> t = 0 and F[0]
Next, it performs Gaussian elimination for Bit-Vectors. However, only variables that occur at most twice in arithmetical expressions are eliminated.
This restriction is used to prevent an increase of the number of adders and multipliers in your formula.
For example, suppose we have x = y+z+w
and x
occurs at p(x+z)
, p(x+2*z)
, p(x+3*z)
and p(x+4*z)
. Then, after eliminating x
, we would have p(y+2*z+w)
, p(y+3*z+w)
, p(y+4*z+w)
and p(y+5*z+w)
. Although we eliminated x
, we have more adders than the original formula.
Next, it eliminates unconstrained variables. This approach is described by in the PhD thesis of Robert Brummayer and Roberto Brutomesso.
Then, another round of simplification is performed. This time, local contextual simplifications are performed. These simplifications are potentially very expensive. So, a threshold on the maximal number of nodes to be visited is used (the default value is 10million). Local context simplification contain rules such as
(x != 0 or y = x+1) -> (x != 0 or y = 1)
ab + ac -> (b+c)*a
Then, it tries to minimize the number of adders and multipliers by applying associativity and commutativity.
Suppose the formula contains the terms a + (b + c)
and a + (b + d)
. Then, Z3 will rewrite them to: (a+b)+c
and (a+b)+d
.
Before the transformation, Z3 would have to encode 4 adders. After, only three adders need to be encode since Z3 uses fully shared expressions.
If the formula contains only equality, concatenation, extraction and similar operators. Then, Z3 uses a specialized solver based on the union-find and congruence closure.
Otherwise, it bit-blasts everything, uses AIGs to minimize the Boolean formula, and invokes its SAT solver.
Z3 uses rewriting for many the simplification performed during pre-processing. Many of the rewrite rules used for the UFBV strategy (with quantifiers) are described to some detail in the following paper:
Wintersteiger, Hamadi, de Moura: Efficiently Solving Quantified Bit-Vector Formulas, FMCAD, 2010
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With