# Extraction and Cost
In the previous sessions, we have seen examples of defining and analyzing syntactic terms in egglog.
After running the rewrite rules, the e-graph may contain a myriad of terms.
We often want to pick out one or a handful of terms for further processing.
Extraction is the process of picking out individual terms out of the many terms represented by an e-graph.
We have seen extract
command in the previous sessions, which allows us to extract the optimal term from the e-graph.
Optimality needs to be defined with regard to some cost model.
A cost model is a function that assigns a cost to each term in the e-graph.
By default, extract
uses AST size as its cost model and picks the term with the smallest cost.
In this session, we will show several ways of customizing the cost model in egglog.
Let's first see a simple example of setting costs with the :cost
attribute.
(push)
Here we have the same Expr language but annotated with :cost
attributes.
(datatype Expr
(Num i64)
(Var String)
(Add Expr Expr :cost 2)
(Mul Expr Expr :cost 10)
)
The default cost of a datatype constructor is 1.
Intuitively, the additional :cost
attributes mark the multiplication operation as more expensive than addition.
Let's look at how cost is computed for a concrete term in the default tree cost model.
; expr = x * 2 + 1
(let expr (Add (Mul (Var "x") (Num 2)) (Num 1)))
This term has a total cost of 18 because
(Add \\ cost = 2 (from Add) + 14 (from left operand) + 2 (from right operand) = 18
(Mul \\ cost = 10 (from Mul) + 2 (from left operand) + 2 (from right operand) = 14
(Var "x") \\ cost = 1 (from Var) + 1 (from "x") = 2
(Num 2) \\ cost = 1 (from Num) + 1 (from 2) = 2
)
(Num 1) \\ cost = 1 (from Num) + 1 (from 1) = 2
)
We can use the extract
command to extract the lowest cost variant of the term.
For now it gives the only version that we just defined
(extract expr)
Let's introduces more variants with rewrites
(rewrite (Mul x (Num 2)) (Add x x))
(run 1)
(extract expr)
It now extracts the lower cost variant that correspondes to x + x + 1
, which is equivalent to the original term.
If there are multiple variants of the same lowest cost, extract
break ties arbitrarily.
(pop)
Setting custom cost for e-nodes
The :cost
attribute sets an uniform additional cost to each appearance of corresponding constructor.
However, this is not expressive enough to cover the case where additional cost of an operation is not a fixed constant.
We can use the set-cost
feature provided by egglog-experimental
to get more fine-grained control of individual e-node's cost.
To show how this feature works, we define a toy language of matrices.
with-dynamic-cost
enables this feature for the constructors defined inside
(with-dynamic-cost
(datatype Matrix
; A matrix constant with fixed size
(MConst i64 i64)
; Matrix multiplication
(MMul Matrix Matrix)
)
)
We also define two analyses for the number of rows and columns
(function row (Matrix) i64 :no-merge)
(function col (Matrix) i64 :no-merge)
(rule (
(= x (MConst r c))
) (
(set (row x) r)
(set (col x) c)
))
(rule (
(= x (MMul y z))
(= r (row y))
(= (col y) (row z))
(= c (col z))
) (
(set (row x) r)
(set (col x) c)
))
Now we define the cost of matrix multiplication as a product of the dimensions
(rule (
(MMul y z)
(= r (row y))
(= m (col y))
(= c (col z))
) (
(set-cost (MMul y z) (* r (* m c)))
))
Let's optimize matrix multiplication with this cost model
(birewrite (MMul x (MMul y z)) (MMul (MMul x y) z))
(let Mexpr (MMul (MMul (MConst 64 8) (MConst 8 256)) (MConst 256 2)))
(run 5)
Thanks to our cost model, egglog is able to extract the equivalent program with lowest cost using the dimension information we provided:
; (MMul (MConst 64 8) (MMul (MConst 8 256) (MConst 256 2)))
(extract Mexpr)
For advanced users who want to further customize the cost model, it is possible to use define your own cost model in Rust
using the interface egglog provides.
Indeed, the set-cost
feature demonstrated here is implemented outside of the core egglog codebase and
uses the extensible cost model interface.
We will show how to implement a simple cost model in Rust later.