# E-class analysis
In this lesson, we learn how to combine the power of equality saturation and Datalog. We will show how we can define program analyses using Datalog-style deductive reasoning, how EqSat-style rewrite rules can make the program analyses more accurate, and how accurate program analyses can enable more powerful rewrites.
Our first example will continue with the path
example in lesson 2.
In this case, there is a path from e1
to e2
if e1
is less than or equal to e2
.
(datatype Expr
; in this example we use big 🐀 to represent numbers
; you can find a list of primitive types in the standard library in `src/sort`
(Num BigRat)
(Var String)
(Add Expr Expr)
(Mul Expr Expr)
(Div Expr Expr))
Let's define some BigRat constants that will be useful later.
(let zero (bigrat (bigint 0) (bigint 1)))
(let one (bigrat (bigint 1) (bigint 1)))
(let two (bigrat (bigint 2) (bigint 1)))
We define a less-than-or-equal-to relation between two expressions.
(leq a b)
means that a <= b
for all possible values of variables.
(relation leq (Expr Expr))
We define rules to deduce the leq
relation. We start with transitivity of leq
:
(rule (
(leq e1 e2)
(leq e2 e3)
) (
(leq e1 e3)
))
We can define a few axioms for deciding when one expression is less than or equal to another.
Base case for leq
for Num
:
(rule (
(= e1 (Num n1))
(= e2 (Num n2))
(<= n1 n2)
) (
(leq e1 e2)
))
Base case for leq
for Var
:
(rule (
(= v (Var x))
) (
(leq v v)
))
Recursive case for leq
for Add
:
(rule (
(= e1 (Add e1a e1b))
(= e2 (Add e2a e2b))
(leq e1a e2a)
(leq e1b e2b)
) (
(leq e1 e2)
))
Note that we have not defined any rules for multiplication. This would require a more complex analysis on the positivity of the expressions.
On the other hand, these rules by themselves are pretty weak. For example, they cannot deduce x + 1 <= 2 + x
.
But EqSat-style axiomatic rules make these rules more powerful:
(birewrite (Add x (Add y z)) (Add (Add x y) z))
(birewrite (Mul x (Mul y z)) (Mul (Mul x y) z))
(rewrite (Add x y) (Add y x))
(rewrite (Mul x y) (Mul y x))
(rewrite (Mul x (Add y z)) (Add (Mul x y) (Mul x z)))
(rewrite (Add x (Num zero)) x)
(rewrite (Mul x (Num one)) x)
(rewrite (Add (Num a) (Num b)) (Num (+ a b)))
(rewrite (Mul (Num a) (Num b)) (Num (* a b)))
To check our rules
(let expr1 (Add (Var "y") (Add (Num two) (Var "x"))))
(let expr2 (Add (Add (Add (Var "x") (Var "y")) (Num one)) (Num two)))
(fail (check (leq e1 e2)))
(run-schedule (saturate (run)))
(check (leq e1 e2)) ; should pass
A useful special case of the leq analysis is if an expression is upper bounded or lower bounded by certain numbers, i.e., interval analysis:
(function upper-bound (Expr) BigRat :merge (min old new))
(function lower-bound (Expr) BigRat :merge (max old new))
In the above functions, unlike leq
, we define upper bound and lower bound as functions from
expressions to a unique number.
This is because we are always interested in the tightest upper bound
and lower bounds, so
(rule (
(leq e (Num n))
) (
(set (upper-bound e) n)
))
(rule (
(leq (Num n) e)
) (
(set (lower-bound e) n)
))
We can define more specific rules for obtaining the upper and lower bounds of an expression based on the upper and lower bounds of its children.
(rule (
(= e (Add e1 e2))
(= (upper-bound e1) u1)
(= (upper-bound e2) u2)
) (
(set (upper-bound e) (+ u1 u2))
))
(rule (
(= e (Add e1 e2))
(= (lower-bound e1) l1)
(= (lower-bound e2) l2)
) (
(set (lower-bound e) (+ l1 l2))
))
... and the giant rule for multiplication:
(rule (
(= e (Mul e1 e2))
(= le1 (lower-bound e1))
(= le2 (lower-bound e2))
(= ue1 (upper-bound e1))
(= ue2 (upper-bound e2))
) (
(set (lower-bound e)
(min (* le1 le2)
(min (* le1 ue2)
(min (* ue1 le2)
(* ue1 ue2)))))
(set (upper-bound e)
(max (* le1 le2)
(max (* le1 ue2)
(max (* ue1 le2)
(* ue1 ue2)))))))
Similarly,
(rule (
(= e (Mul x x))
) (
(set (lower-bound e) zero)
))
The interval analysis is not only useful for numerical tools like Herbie, but it can also guard certain optimization rules, making EqSat-based rewriting more powerful!
For example, we are interested in non-zero expressions
(relation non-zero (Expr))
(rule ((< (upper-bound e) zero)) ((non-zero e)))
(rule ((> (lower-bound e) zero)) ((non-zero e)))
(rewrite (Div x x) (Num one) :when ((non-zero x)))
(rewrite (Mul x (Div y x)) y :when ((non-zero x)))
This non-zero analysis lets us optimize expressions that contain division safely.
; 2 * (x / (1 + 2 / 2)) is equivalent to x
(let expr3 (Mul (Num two) (Div (Var "x") (Add (Num one) (Div (Num two) (Num two))))))
(let expr4 (Var "x"))
(fail (check (= expr3 expr4)))
(run-schedule (saturate (run)))
(check (= expr3 expr4))
; (x + 1)^2 + 2
(let expr5 (Add (Mul (Add (Var "x") (Num one)) (Add (Var "x") (Num one))) (Num two)))
(let expr6 (Div expr5 expr5))
(run-schedule (saturate (run)))
(check (= expr6 (Num one)))
Debugging tips!
print-size
is used to print the size of a table. If the table name is omitted, it prints the size of every table.
This is useful for debugging performance, by seeing how the table sizes evolve as the iteration count increases.
(print-size leq)
(print-size)
print-function
extracts every instance of a constructor, function, or relation in the e-graph.
It takes the maximum number of instances to extract as a second argument, so as not to spend time
printing millions of rows. print-function
is particularly useful when debugging small e-graphs.
(print-function leq 15)
extract
can also take a second argument, which causes it to extract that many different "variants" of the
first argument. This is useful when trying to figure out why one e-class is failing to be unioned with another.
(extract expr3 3)
Exercises:
(Free variable analysis)
One analysis that is frequently useful is free variable sets. While it is possible to simulate "sets" using
only functions, egglog provides containers to make this less tedious and more efficient. A container
is essentially a value that can store other values; some examples are Set
, Map
, and Vec
.
However, before we can construct a container, we have to tell egglog
what sort to use inside
the container. This is done with an overload of the sort
command.
(sort FreeVarSet (Set String))
Now, we can construct sets with set-of
.
(extract (set-of "1" "1"))
We will need a function to store the results of our free variable analysis. We have to use set intersection
for the merge function, because of rewrites like x / x => 1
.
(function FreeVars (Expr) FreeVarSet :merge (set-intersect old new))
Finally, we will need you to write the rules for the free variable analysis. You should have one rule for
every variant of Expr
. Here's an example rule for Add
:
(rule (
(= e (Add a b))
(= f (FreeVars a))
(= g (FreeVars b))
) (
(set (FreeVars e) (set-union f g))
))
If everything worked, expr5
should only have "x"
as a free variable.
(run-schedule (saturate (run)))
(extract (FreeVars expr3))