In this article I discuss three basic types of operations on finger trees:
- adding elements
- deconstructing trees into head and tail
- appending two finger trees
There are three kinds of finger trees:
F = <> (Empty)
= <a> (Single)
= <[1-4], F, [1-4]> (Deep)
The Empty finger tree is, well, empty. The Single finger tree stores exactly one item. The Deep finger tree consists of three parts. The middle part is another finger tree, so it is recursive. The first and last part are called the prefix and suffix respectively, and they contain so-called digits. An important invariant of both prefixes and suffixes in Deep finger trees is that they always contain between 1 and 4 digits, inclusive.
You can think of a finger tree as a structure that defines a sequence of elements of any type, say type T. The Empty tree represents the empty sequence, a Single tree represents one single element, and a Deep tree represents at least two elements in a sequence starting at its prefix, ranging over the middle tree, ending in its suffix.
The first simple operation we can perform on finger trees is adding elements to the right using the addRight operation, also called snoc (inverse of cons).
We start with an Empty finger tree <>. If we push an element onto its right side, we end up with a Single finger tree.
<>.addRight(a) = <a>
If we add yet another element on the right, it becomes a Deep finger tree.
<a>.addRight(b) = <[a], <>, [b]>
This Deep tree has an Empty tree in the middle, a prefix consisting of one digit (element a) and a suffix consisting of one digit (element b).
We add another three elements on the right.
<[a], <>, [b]>.addRight(c).addRight(d).addRight(e)
= <[a], <>, [b, c, d, e]>
The additional elements are simply appended to the suffix. Now, if we add another element on the right side, we cannot simply add it to the suffix as before because this would break our invariant that states that a suffix can have a maximum of 4 digits. Therefore, to add another element we have to split the suffix and move a part of it into the middle tree.
<[a], <>, [b, c, d, e]>.addRight(f)
= <[a], <(b, c, d)>, [e, f]>
In the middle of the resulting tree we have a Single finger tree that contains a single node consisting of three elements. This is another important characteristic of a finger tree: if digits in a Deep tree have type T, the elements and digits in the middle tree are of type Node<T>
We will represent nodes as elements between parentheses. There are two type of nodes: 2-nodes and 3-nodes. When splitting the suffix in the previous example, we could also have opted for a 2-node in the middle, leaving 3 digits in the suffix of the resulting Deep tree.
Adding to the left of a finger tree (addLeft or cons) is a mirror of the addRight operation. Starting with an Empty tree, and adding elements a..l we end up with the following finger tree:
<[l, k], <[(j, i, h), (g, f, e)], <>, [(d, c, b)]>, [a]>
When adding elements to a Deep tree, we have only have to protect the 1-4 digits invariant by making sure that we do not end up with 5 digits. When taking away elements from a Deep tree, we have to perform the opposite check and guard ourselves from ending up with an empty prefix or suffix. To see how this works, let's look at the leftHead and leftTail operation. leftHead is simple: take the leftmost item of the prefix. leftTail returns the same tree without the item returned by leftHead. leftTail is straightforward most of the time. Abstracting away the middle and the suffix, we have for example
<[l, k], M, sf>.leftHead() = l
<[l, k], M, sf>.leftTail() = <[k], M, sf>
Now if we want to take the leftTail again, then we would end up with an empty prefix. The trick here is to pull a digit from the middle tree M. But M could be Empty, so we are forced to check this as well.
Therefore, if M is Empty, then
<[k], <>, sf>.leftTail() = treeOfDigits(sf)
treeOfDigits is simply defined as the function that takes a number of digits and turns it into a finger tree (for example by consecutively applying addRight):
treeOfDigits([d]) = <d>
...
treeOfDigits([d1, d2, d3, d4]) = <[d1], <>, [d2, d3, d4]>
For example:
<[k], <>, [c, b, a]>.leftTail()
= treeOf(c, b, a)
= <[c], <>, [b, a]>
If M is not Empty then
<[k], M, sf>.leftTail()
=
If the type of digits in F is T, then the items in the middle tree M of F are of type Node
A concrete example of using leftTail with a non-Empty middle tree is
<[d], <(x, y, z)>, [c, b, a]>.leftTail()
= <(x, y, z).toDigits(),
= <[x, y, z], <>, [c, b, a]>
The above example also shows that you go from a Single tree to an Empty tree when taking the leftTail of the first. We also go from a Deep to an Single tree when taking the leftTail of the first.
<[a], <>, [b]>.leftTail() = <b>
<a>.leftTail() = <>
Finaly, for Empty trees, leftTail makes no sense and is undefined.
Appending two finger trees is not that difficult as well. Again, the cases involving Empty and Single are trivial.
<>.append(<>) = <>
<a>.append(<>) = <>.append(<a>) = <a>
<a>.append(<b>) = <[a], <>, [b]>
For Deep trees, we first observe that we can preserve the leftmost prefix and the rightmost suffix when appending.
We still need to determine M, which only depends on Ml, sfl, prr, and Mr. Again we have to think of our invariant (1-4 digits) and typing rule (adding to or removing digits from the middle tree involves constructing and deconstructing nodes respectively).
The paper by Hinze and Paterson suggests the following algorithm:
1) toNodes: make a list L of nodes based on the digits in sfl and prr
2) app: append Ml, L and Mr, possibly by recursively going back to step 1
To illustrate how this works, let's append the following finger trees:
<[0], <(1, 2, 3)>, [4, 5, 6, 7]>
and
<[a],<[(b, c, d)], <>, [(e, f, g), (h, i, j)]>, [k, l]>
First of all, the result will be in the form of <[0], M, [k, l]>. We start computing M by concatenating the "middle" digits:
L = toNodes(4, 5, 6, 7, a) = [(4, 5, 6), (7, a)]
L is a list of 2 digits: a 3-node and a 2-node. We could also take L = [(4, 5), (6, 7, a)] but this is not terribly important. What is important, is that this step reduces a list of possibly more than 4 digits of type T to a list of maximum 3 digits of type Node<T>
In general, toNodes takes a minimum of 2 digits and a maximum of 12 digits to compute a list of anything between 1 2-node and 4 3-nodes depending on its input, preferring smaller resulting list sizes over larger ones (or the use of 3-nodes over 2-nodes where possible).
toNodes(1, 2) = [(1, 2)]
toNodes(1, 2, 3) = [(1, 2, 3)]
toNodes(1, 2, 3, 4) = [(1, 2), (3, 4)]
...
toNodes(1, 2, 3, 4, 5, 6, 7, 8) = [(1, 2, 3), (4, 5, 6), (7, 8)]
...
toNodes(1..12) = [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12)]
Continuing with the example, now that we have L for our M we call an internal append, called app, to append Ml, L, and Mr.
M = app(<(1, 2, 3)>, [(4, 5, 6), (7, a)], <[(b, c, d)], <>, [(e, f, g), (h, i, j)]>)
This almost looks like our regular append operation of two finger trees: we have two trees Ml and Mr but here we also have a list L of digits in the middle. Let's break it down into distinct cases again.
First, the base cases where we don't recursively call app:
app(<>, [d1, ..., dn], F') = F'.addLeft(dn)...addLeft(d1)
app(F', [d1, ..., dn], <>) = F'.addRight(d1)...addRight(dn)
app(<a>, [d1, ..., dn], F') = F'.addLeft(dn)...addLeft(d1).addLeft(a)
app(F', [d1, ..., dn], <a>) = F'.addRight(d1)...addRight(dn).addRight(a)
The only remaining case is when we have 2 Deep trees. We return a new Deep tree for which we recursively call app to construct its middle tree.
app(<pr1, M1, [sfd1, ..., sfdn]>, [d1, ..., dn], <[prd1, ..., prdn], M2, sf2>)
= <pr1, app(M1, toNodes(sfd1, ..., sfdn, d1, ..., dn, prd1, ..., prdn), M2), sf2)>
The recursion is guaranteed to end because at each step we deconstruct Deep trees, splitting off a prefix and a suffix and packing the remaining digits using toNodes. Eventually we will end up in a base case because our trees have finite height.
Going back to our example, we compute
M = app(<(1, 2, 3)>, [(4, 5, 6), (7, a)], <[(b, c, d)], <>, [(e, f, g), (h, i, j)]>)
= <[(b, c, d)], <>, [(e, f, g), (h, i, j)]>.addLeft((7, a)).addLeft((4, 5, 6)).addLeft((1, 2, 3))
= <[(7, a), (b, c, d)], <>, [(e, f, g), (h, i, j)]>.addLeft((4, 5, 6)).addLeft((1, 2, 3))
= <[(4, 5, 6), (7, a), (b, c, d)], <>, [(e, f, g), (h, i, j)]>.addLeft((1, 2, 3))
= <[(1, 2, 3), (4, 5, 6), (7, a), (b, c, d)], <>, [(e, f, g), (h, i, j)]>
Substituting this finger tree for M in <[0], M, [k, l]> gives us the final result for our example of append.
No comments:
Post a Comment