The search graph and caching
So now we have a good idea of what behavior we expect from cycles, or
at least inductive cycles (we'll talk about coinduction later). But how do we
actually implement this? That's where the SearchGraph
comes into play.
The role of the SearchGraph
is to store information about each goal that we
are currently solving. Typically, these are goals on the stack -- but other
times, they are goals that are no longer on the stack, but whose results
(because of a cycle) were dependent on something that is still on the stack.
We'll work through some examples to make it all clear.
Structure of the search graph
The search graph consists of nodes, each of which is assigned an index called a
DepthFirstNumber
. The name of this index alludes to the fact that, as we try
to prove a given goal, we are implicitly performing a "depth-first search" over
a graph of subgoals, and the index in the search graph is similar to a pre-order
index on the resulting tree.
Example search graph
Consider this example trait plus impls:
#![allow(unused)] fn main() { trait A { } impl<T: A, U: A> A for Result<T, U> { } impl A for u32 { } impl A for i32 { } impl A for f32 { } }
If we consider the full set of goals/subgoals that are involved in proving Implemented(Result<u32, i32>: A)
,
it would look like this:
graph TD G1["Implemented(Result<u32, i32>: A)<br>Pre-order number: 0<br>DepthFirstNumber: 0"] G2["Implemented(u32: A)<br>Pre-order number: 1<br>DepthFirstNumber: 1"] G3["Implemented(i32: A)<br>Pre-order number: 2<br>DepthFirstNumber: 1"] G1 --> G2 G1 --> G3
The graph also shows a possible set of pre-order numbers, as well as the
DepthFirstNumber
that would be used in the search graph. You can see that
they start to diverge. Pre-order numbers uniquely identify each goal in the
graph. In contrast, after we finish proving Implemented(u32: A)
, we remove
that node the graph, and hence its DepthFirstNumber
is re-used.
Goal lifecycle
Every goal that we prove in the recursive solver goes through the following states:
graph TD NewlyCreated["Newly created"] OnStack["On stack and in the search graph"] InGraph["Popped from stack but retained in search graph"] ProcessingComplete["Processing complete"] InCache["Stored in the cache"] NewlyCreated --> OnStack OnStack -- Explore all program clauses for the goal --> ProcessingComplete ProcessingComplete -- If node is a participant in a cycle --> InGraph InGraph -- On next iteration --> OnStack ProcessingComplete -- If not part of a cycle or when fixed-point is reached --> InCache
At first, we create the goal and push it onto the stack, and we also add it to the search graph. We then explore each of the relevant program clauses to try and find the solution(s) to the goal. Along the way we update the overall solution:
- If there are no valid solutions, then the result is an error.
- If there is exactly one solution, then we remember it as the unique solution.
- If there are multiple distinct solutions, the result is "ambiguous".
While we are doing this solving, we also track what other goals this goal winds up depending on. In particular, we are looking to see whether it winds up as a participant in a cycle -- that is, if it depends on any goals further up the goal stack.
If, when we're done with all program clauses, the goal never participated in any cycles, then we have reached our final solution. We can take that result and put it into the cache. The next time we look for a solution to this goal, we'll check that cache and return the result.
But otherwise, if the goal was a participant in a cycle, then we have to iterate, as described in the section on cycles. In that case, we keep the goal in the search graph, but we remove it from the stack. This allows the search graph to serve as a kind of "interim cache". If, as we continue to search through the other nodes that remain on the stack, we have to solve this same goal again, we will find it in the search cache and re-use the result.
For goals that are participants in a cycle, when the cycle reaches its fixed-point (i.e., the top-most node has stopped changing), we go through and take all the results for all the subgoals (which are still present in the search graph) and move them all into the "final cache".
In other words, any result that is present in the search graph can be considered an "interim cache", with a result that is still being determined and may be dependent on other goals on the stack. Once the goal is completely processed, it is moved to the cache field where others can use it.
Processing a single goal, a flow chart
Whenever we are asked to solve a goal, these are the steps we take:
graph TD GoalInGraph["Goal in search graph?"] FlagAsHead["If goal is on stack,<br>flag as head of cycle."] ReturnCurrentResult["Return result from<br>search graph."] PushOnStack["Push goal on stack,<br>add to the search graph with index `G`,<br>initial result is error"] ProcessEachClause["Process each program clause in turn,<br>computing result,<br>and tracking `Minimums`"] IsCycleParticipant["Is G dependent on<br>goal lower in stack?"] StoreInCache["Move results `G..` <br>from search graph to cache"] PopFromCacheNotGraph["Pop goal from stack<br>but leave in search graph"] CompareResult["Did result change from<br>what is stored in search graph?"] UpdateSearchGraph["Update stored result<br>in search graph"] ClearPreviousIteration["Clear search graph nodes `G+1..`<br>from previous iteration"] GoalInGraph -- Yes --> FlagAsHead FlagAsHead --> ReturnCurrentResult GoalInGraph -- No, not in the graph --> PushOnStack PushOnStack --> ProcessEachClause ProcessEachClause -- Is head of cycle --> CompareResult ProcessEachClause -- Not head of cycle --> IsCycleParticipant CompareResult -- No, fixed-point reached --> IsCycleParticipant CompareResult -- Yes, result changed --> UpdateSearchGraph UpdateSearchGraph --> ClearPreviousIteration ClearPreviousIteration --> ProcessEachClause IsCycleParticipant -- No --> StoreInCache IsCycleParticipant -- Yes --> PopFromCacheNotGraph
Starting to prove a goal
The first thing we do when proving some goal G is to check the search graph to see if there is already a node for this goal.
If there is a node for G
If there is a node for G, that indicates that there is some sort of cycle involved in the graph. For now, we will defer this case, and come back to it after we've explained what happens without cycles.
If there is no node for G: pushing a new goal onto the stack
If there is no node for G in the graph, then we have a new subgoal to add
to the graph. We will first push a new entry onto the stack, yielding some new
stack depth d
. Then we create a new Node
in the search graph. It will be
assigned the next available DepthFirstNumber
. The search graph node contains a field
stack_depth
that will be set to Some(d)
, where d
is the depth of the
node on the stack.
The search graph node also stores the "current solution" for the given goal. As described in the search on inductive cycles, this solution starts out as an error but may be gradually widened as we iterate, if we find solutions.
Tracking dependencies
The way that we track dependencies is through a structure called the
Minimums
. The name comes from the idea that it is tracking the minimum
DepthFirstNumber
of any goal whose result we depended on. The minimum for a
goal G1 starts out as G1, since its result depends on itself, but if it winds up
recursively processing some goal G2 that is on the stack, then the minimum will
be adjusted to G2.
Along with the interim solution, the search graph node for a given goal also
stores the Minimums
that resulted from computing that interim solution
(i.e., what goals did that solution depend on). If some goal G1 winds up
recursively invoking some goal G2 that is in the search graph but not present
on the stack, then we update the current Minimums
with the values stored in
the search graph.
Removing nodes from the graph
Once we complete the processing for a node, it needs to be removed from the
processing stack. But we wish to leave it in the graph if it is dependent on
something else that is already on the stack. We do that just by checking the
Minimums
value to see if it is less than the current goal.