# Equality Saturation

In this tutorial, we will build an optimizer for a subset of linear algebra using egglog. We will start by optimizing simple integer arithmetic expressions. Our initial DSL supports constants, variables, addition, and multiplication.

(datatype Expr
  (Num i64)
  (Var String)
  (Add Expr Expr)
  (Mul Expr Expr))

Now, let's define some simple expressions.

; expr1 = 2 * (x * 3)
(let expr1 (Mul (Num 2) (Mul (Var "x") (Num 3))))
; expr2 = 6 * x
(let expr2 (Mul (Num 6) (Var "x")))

If you are running the program in the web demo, in the bottom right corner, you should see an e-graph with two expressions, each corresponding to the two expressions we defined above.

We can use the extract command to print the results of expressions in egglog..

(extract "Hello, world!") ; prints Hello, world!
(extract 42) ;; prints 42
(extract expr1) ;; prints (Mul (Num 2) (Add (Var "x") (Num 3)))
(extract expr2) ;; prints (Add (Num 6) (Mul (Num 2) (Var "x")))

The extract command does more than printing. We will see the power of extract once we get into program optimization.

We can use the check commands to check properties of our e-graph.

(check (= expr1 (Mul x y)))

This checks if expr1 is equivalent to some expression (Mul x y), where x and y are variables that can be mapped to any expression in the e-graph.

Checks can fail. For example the following check fails because expr1 is not equivalent to (Add x y) for any x and y in the e-graph.

 ; (check (= expr1 (Add x y)))

Let us define some rewrite rules over our small DSL.

(rewrite (Add x y) (Add y x))

This rule asserts that addition is commutative. More concretely, this rules says, if the e-graph contains expressions of the form (Add x y), then the e-graph should also contain the expression (Add y x), and they should be equivalent.

Similarly, we can define the associativity rule for addition.

(rewrite (Add x (Add y z)) (Add (Add x y) z))

This rule says, if the e-graph contains expressions of the form (Add x (Add y z)), then the e-graph should also contain the expression (Add (Add x y) z), and they should be equivalent.

There are two subtleties to rules in egglog

  1. Defining a rule is different from running it. The following check would fail at this point because the commutativity rule has not been run.

    (check (= expr1 (Mul (Num 2) (Add (Num 3) (Var "x")))))

  2. Rules are not instantiated for every possible term; they are only instantiated for terms that are in the e-graph. For instance, even if we run the commutativity rule above, the following check would still fail because the e-graph does not contain the term (Add (Num -2) (Num 2)) (or (Add (Num 2) (Num -2))).

    (check (= (Add (Num -2) (Num 2)) (Add (Num 2) (Num -2))))

Let's also define commutativity and associativity for multiplication.

(rewrite (Mul x y) (Mul y x))
(rewrite (Mul x (Mul y z)) (Mul (Mul x y) z))

egglog also defines a set of built-in functions over the primitive types, such as + and *.

(extract (+ 1 2))

egglog supports operator overloading, so the same operator can be used with different types.

(extract (+ "1" "2"))
(extract (+ 1.0 2.0))

With primitives, we can define rewrite rules that talk about the semantics of operators. The following rules show constant folding over addition and multiplication.

(rewrite (Add (Num a) (Num b))
         (Num (+ a b)))
(rewrite (Mul (Num a) (Num b))
         (Num (* a b)))

While we have defined several rules, the e-graph has not changed since we inserted the two expressions. To run rules we have defined so far, we can use the run command.

(run 10)

This tells egglog to run our rules for 10 iterations. More precisely, egglog runs the following pseudo code:

 G = currentEgraph()
 for i in 1..10:
   for each rule r:
     ms = r.find_matches(G)
     for m in ms:
       G = G.apply_rule(r, m)
     G = rebuild(G)

In other words, egglog computes all the matches for one iteration before making any updates to the e-graph. This is in contrast to an evaluation model where rules are immediately applied and the matches are obtained on demand over a changing e-graph.

If you are using the web demo, after running the rules, you should see the e-graph has grown a little bit, and that (Mul (Num 2) (Add (Var "x") (Num 3))) and (Add (Num 6) (Mul (Num 2) (Var "x"))) are in the same E-class. In fact, we can check that

(check (= expr1 expr2))

Exercises

(Distributivity) Define the distributivity rule for multiplication over addition. When you're done, the following code should pass.

 ; expr3 = 2 * (x + 3)
 (let expr3 (Mul (Num 2) (Add (Var "x") (Num 3))))
 ; expr4 = 6 + 2 * x
 (let expr4 (Add (Num 6) (Mul (Num 2) (Var "x"))))
 (fail (check (= expr3 expr4)))
 (run 10)
 (check (= expr3 expr4))

(birewrite) Egglog has the birewrite keyword, which is similar to rewrite, but it allows rewriting in both directions. Try making some of the rules above into birewrite rules. Looking at the rules we have defined, when is a birewrite rule useful and when it is not? When is it not even well-defined?