# Equality Saturation
In this tutorial, we will build an optimizer for a subset of linear algebra using egglog. We will start by optimizing simple integer arithmetic expressions. Our initial DSL supports constants, variables, addition, and multiplication.
(datatype Expr
(Num i64)
(Var String)
(Add Expr Expr)
(Mul Expr Expr))
Now, let's define some simple expressions.
; expr1 = 2 * (x * 3)
(let expr1 (Mul (Num 2) (Mul (Var "x") (Num 3))))
; expr2 = 6 * x
(let expr2 (Mul (Num 6) (Var "x")))
If you are running the program in the web demo, in the bottom right corner, you should see an e-graph with two expressions, each corresponding to the two expressions we defined above.
We can use the extract
command to print the results of expressions in egglog
..
(extract "Hello, world!") ; prints Hello, world!
(extract 42) ;; prints 42
(extract expr1) ;; prints (Mul (Num 2) (Add (Var "x") (Num 3)))
(extract expr2) ;; prints (Add (Num 6) (Mul (Num 2) (Var "x")))
The extract
command does more than printing. We will see the power of extract
once we get into program optimization.
We can use the check
commands to check properties of our e-graph.
(check (= expr1 (Mul x y)))
This checks if expr1
is equivalent to some expression (Mul x y)
, where x
and y
are
variables that can be mapped to any expression in the e-graph.
Checks can fail. For example the following check fails because expr1
is not equivalent to
(Add x y)
for any x
and y
in the e-graph.
; (check (= expr1 (Add x y)))
Let us define some rewrite rules over our small DSL.
(rewrite (Add x y) (Add y x))
This rule asserts that addition is commutative. More concretely, this rules says, if the e-graph
contains expressions of the form (Add x y)
, then the e-graph should also contain the
expression (Add y x)
, and they should be equivalent.
Similarly, we can define the associativity rule for addition.
(rewrite (Add x (Add y z)) (Add (Add x y) z))
This rule says, if the e-graph contains expressions of the form (Add x (Add y z))
, then the e-graph should also contain
the expression (Add (Add x y) z)
, and they should be equivalent.
There are two subtleties to rules in egglog
-
Defining a rule is different from running it. The following check would fail at this point because the commutativity rule has not been run.
(check (= expr1 (Mul (Num 2) (Add (Num 3) (Var "x")))))
-
Rules are not instantiated for every possible term; they are only instantiated for terms that are in the e-graph. For instance, even if we run the commutativity rule above, the following check would still fail because the e-graph does not contain the term
(Add (Num -2) (Num 2))
(or(Add (Num 2) (Num -2))
).(check (= (Add (Num -2) (Num 2)) (Add (Num 2) (Num -2))))
Let's also define commutativity and associativity for multiplication.
(rewrite (Mul x y) (Mul y x))
(rewrite (Mul x (Mul y z)) (Mul (Mul x y) z))
egglog
also defines a set of built-in functions over the primitive types, such as +
and *
.
(extract (+ 1 2))
egglog
supports operator overloading, so the same operator can be used with different types.
(extract (+ "1" "2"))
(extract (+ 1.0 2.0))
With primitives, we can define rewrite rules that talk about the semantics of operators. The following rules show constant folding over addition and multiplication.
(rewrite (Add (Num a) (Num b))
(Num (+ a b)))
(rewrite (Mul (Num a) (Num b))
(Num (* a b)))
While we have defined several rules, the e-graph has not changed since we inserted the two
expressions. To run rules we have defined so far, we can use the run
command.
(run 10)
This tells egglog
to run our rules for 10 iterations. More precisely, egglog runs the following pseudo code:
G = currentEgraph()
for i in 1..10:
for each rule r:
ms = r.find_matches(G)
for m in ms:
G = G.apply_rule(r, m)
G = rebuild(G)
In other words, egglog
computes all the matches for one iteration before making any
updates to the e-graph. This is in contrast to an evaluation model where rules are immediately
applied and the matches are obtained on demand over a changing e-graph.
If you are using the web demo, after running the rules, you should see the e-graph has grown
a little bit, and that (Mul (Num 2) (Add (Var "x") (Num 3)))
and (Add (Num 6) (Mul (Num 2)
(Var "x")))
are in the same E-class. In fact, we can check that
(check (= expr1 expr2))
Exercises
(Distributivity) Define the distributivity rule for multiplication over addition. When you're done, the following code should pass.
; expr3 = 2 * (x + 3)
(let expr3 (Mul (Num 2) (Add (Var "x") (Num 3))))
; expr4 = 6 + 2 * x
(let expr4 (Add (Num 6) (Mul (Num 2) (Var "x"))))
(fail (check (= expr3 expr4)))
(run 10)
(check (= expr3 expr4))
(birewrite) Egglog has the birewrite
keyword, which is similar to rewrite
, but it allows rewriting
in both directions. Try making some of the rules above into birewrite rules. Looking at the rules
we have defined, when is a birewrite rule useful and when it is not? When is it not
even well-defined?