Sections have now been added to the paper to incorporate explanations of our project and embed algorithms.
Version 0.7
We have implemented an embed function which converts d-dimensional n-branching automata into d-dimensional 2-branching automata. This may be useful to simplify deduction rules for CKY as well as creating a method of accessing rules in the automaton. We tested this using rec and project on automaton and trees from the Wrapping of Trees paper.
wrap1.aut
3 0 ~
who DPi ~
does I ~
does2 Ij ~
like V ~
Alice DP ~
Carol DP ~
think V ~
t2 Ij ~
t DPi ~
Cb Cb ~
Ib Ib ~
VP VP ~
that C ~
CP CP ~
IP IP ~
S S CP(~,DPi[1](Cb(~,C[1](Y(~,I[1](VP(~,V[1](DPi(~,~,~),~,~),~),~,~),~),~,~),~),~,~),~)
Cb Cb Cb(~,Ij[1](Y(~,Ij[1](VP(~,V[1](Cb(~,~,~),~,~),~),~,~),~),~,~),~)
Y Y IP(~,DP[1](Ib(~,~,~),~,~),~)
after using the embed() function:
wrap1.embed.aut
3 0 ~
does2 Ij ~
like V ~
that C ~
does I ~
IP IP ~
Alice DP ~
who DPi ~
Cb Cb ~
CP CP ~
Ib Ib ~
think V ~
Carol DP ~
t DPi ~
VP VP ~
t2 Ij ~
Cb' Cb' Cb[0](~,C[1](Y'[0](~,~,~),~,~),~)
VP' VP' VP[0](~,V[1](DPi[0](~,~,~),~,~),~)
Y Y IP[0](~,DP[1](Ib[0](~,~,~),~,~),~)
S S CP[0](~,DPi[1](Cb'[0](~,~,~),~,~),~)
VP' VP' VP[0](~,V[1](Cb[0](~,~,~),~,~),~)
Y' Y' Y[0](~,Ij[1](VP'[0](~,~,~),~,~),~)
Cb Cb Cb[0](~,Ij[1](Y'[0](~,~,~),~,~),~)
Y' Y' Y[0](~,I[1](VP'[0](~,~,~),~,~),~)
wrap2.aut
4
0 ~
that C ~
Alice DP ~
does I ~
like V ~
Bob DP ~
seem V ~
it DP ~
Y Y ~
CP CP ~
Cb Cb ~
IP IP ~
Ib Ib ~
DP DP ~
VP VP ~
S S ~
T T S(~,~,CP[2](~,DP[1](Cb(~,C[1](Y(~,I[1](VP(~,V[1](DP,~,~,~),~,~),~,~,~),IP[0](~,DP[1](Ib,~,~,~),~,~),~),~,~,~),~,~),~,~,~),~,~),~)
Cb Cb Cb(~,~,Cb[3](~,DP[1](Y(~,I[1](VP(~,V[1](Cb,~,~,~),~,~),~,~,~),IP[0](~,DP[1](Ib,~,~,~),~,~),~),~,~,~),~,~),~)
after embed():
wrap2.embed.aut
4
0 ~
tau tau ~
Y'' Y'' Y'[0](~,~,IP[0](~,DP[1](Ib[0](~,~,~,~),~,~,~),~,~),~)
does I ~
Y'' Y'' Y'[0](~,~,IP[0](~,DP[1](Ib[0](~,~,~,~),~,~,~),~,~),~)
Cb' Cb' tau[0](~,~,Cb[0](~,C[1](Y''[0](~,~,~,~),~,~,~),~,~),~)
T T S[0](~,~,CP[1](~,DP[1](Cb'[0](~,~,~,~),~,~,~),~,~),~)
like V ~
DP DP ~
that C ~
Cb Cb ~
VP VP ~
IP IP ~
VP' VP' tau[0](~,~,VP[0](~,V[1](DP[0](~,~,~,~),~,~,~),~,~),~)
Alice DP ~
S S ~
Cb Cb Cb[0](~,~,Cb[1](~,DP[1](Y''[0](~,~,~,~),~,~,~),~,~),~)
Bob DP ~
Y' Y' tau[0](~,~,Y[0](~,I[1](VP'[0](~,~,~,~),~,~,~),~,~),~)
it DP ~
seem V ~
Y' Y' tau[0](~,~,Y[0](~,I[1](VP'[0](~,~,~,~),~,~,~),~,~),~)
Y Y ~
VP' VP' tau[0](~,~,VP[0](~,V[1](Cb[0](~,~,~,~),~,~,~),~,~),~)
CP CP ~
Ib Ib ~
Traditionally, the context-free grammars are represented as a set of
string replacement rules: given a symbol alphabet $\sigma$, some initial
symbol $I$, a set of terminal symbols $F$, and some set of string
rewriting rules $V$ (each of which map a symbol, $x \in \sigma$, to a
string of symbols, $y \in \sigma^*$), a string $s$ is in the language if
and only if $s \in F^*$ and $s$ can be generated by one or more
successive applications of the rules in $V$.
Rule 1: S -> S S
Rule 2: S -> ( S )
Rule 3: S -> ( )
String: (())()()
One possible parse:
S (Initial symbol)
S S (Rule 1)
( S ) S (Rule 2)
( S ) S S (Rule 1)
( ( ) ) S S (Rule 3)
( ( ) ) ( ) S (Rule 3)
( ( ) ) ( ) ( ) (Rule 3)
Figure 1.1:
Rewriting with a simple 2-dimensional grammar
Equivalently, one can represent the context-free grammars using
two-dimensional trees. Let $V$ be a set of local trees, each of which
is a one-dimensional ordering of symbols in $\sigma^*$, dominated in the
second dimension by a symbol in $\sigma - F$. Let each member of $F$ be
the one-dimensional root of an empty local tree. Let the initial symbol
$I$ be the one-dimensional root of an empty derivation tree.
Context-free generation can then be performed by adjoining local trees
in $V$ to the derivation tree - by replacing some one-dimensional leaf
in $I$ with a two-dimensional local tree in $V$ one or more times. The
derived string, at any point in the process, is the one-dimensional
yield of $I$.
Rule 1: S
/
S---S
Rule 2: S
/
(-S-)
Rule 3: S
/
(---)
One possible parse, with labeled roots:
(1) S
/
(2) S---------S (1)
/ /
/ (3) S------S (3)
/ / /
(3) (--S--) (---) (---)
/
(---)
One-dimensional yield:
(( )) ( ) ( )
Figure 1.2:
Figure 1.1 using local trees in left-child/right-sibling form
The tree adjoining grammars (TAG) raise the context-free grammars into
three dimensions. A rewriting rule in TAG maps some one-dimensional leaf
node to a two-dimensional tree.
We have been running rec with some more complex 3- and 4-d trees taken from the figures in Wrapping of Trees. We have two automaton and accompanying trees recognizing the strings "Who does Carol think Alice does like?" and "It does seem that Alice does like Bob". We also tried to do a more generic automaton but we don't understand the theory behind the syntax so we weren't able to construct a grammar for it.
Automaton 1
Tree 1
Automaton 2
Tree 2
It is useful for our CKY-style parsing algorithm to be able to handle arbitrary-branching trees of arbitrary dimensions as well as the binary-branching trees of arbitrary dimensions that it already handles. However, the straight-forward approach, which would be to use n-dimensional state trees paired with labels as the domain of the delta function of our automaton, will cause the time complexity to blow up exponentially as the branching factor increases. To reduce the time complexity of the CKY-style parsing algorithm, and also to be able to simply adapt the arbitrary-branching automata to our existing binary-branching code, it is desirable to "embed" the arbitrary-branching automaton rules into binary-branching automaton rules.
Thanks to our lemma, a binary-branching, n-dimensional tree in left-child right-sibling form can be viewed as a simple array of n+1 nodes, where the (i+1)-st node (indexed starting at 0) is the (n-i)-th dimensional successor of the i-th node. The node indexed at i can have no successor in any dimension less than i because the lemma forbids it, and it can have no successor in any dimension greater than i because the node already has a parent in each such dimension, and so the inclusion of such a successor would make the tree more than binary-branching.
Thus, to embed an arbitrary-branching tree into a binary-branching structure, we can simply iterate along the proper binary-branching structure while checking for illegal links. When such links are found, we replace the parent node of the link with a new, unique label, and we introduce a new rule rooted in that same label. The tree for the new rule consists of the original parent node of the link along with its yield in that link. If the dimension of that link is less than n-1, then the higher-dimensional parent nodes in the tree are padded out by using a special node value which we shall call tau. Tau is not permitted to be the root or head of any rule in the automaton, and it is understood that a tau node is not part of the tree but is intended to help us "excavate" or reconstruct the arbitrary-branching structure after parsing is complete. Because tau is never the root or the head and occurs only in binary-branching rules, we can guarantee that each tau node will always have exactly one successor. The arbitrary-branching structure will be excavated in two steps: first, each and every tau node is replaced with its successor. Second, each uniquely labeled node that was generated in the embedding is replaced with its n-dimensional successor. Once the new rule is generated, it then must be recursively embedded until it too is reduced to a binary-branching structure.
Example embedding of a 3-dimensional tree:
Example embedding of a 4-dimensional tree (the dashed arrows indicate that the process continues until the tree is fully embedded):
We also need to preserve the heads of the yields when embedding the structure, which complicates things a bit. When a new tree is introduced from an i-dimensional link, its heads in dimensions less than i will be unaffected, so they can be imported directly from the pre-existing structure. In addition, the head offsets in dimensions greater than i must all be 1, as a value of 0 would make a a tau node a head, which is not permitted. In dimension i, we find that the head offset will simply be one less than the i-dimensional head offset of the pre-existing structure, as we have effectively cut off the first node in the i-dimensional local yield. If the new head offset is -1, then the first node in the pre-existing i-dimensional local yield was the head, and so the i-dimensional head of the new rule is irrelevant, as nothing will ever be hung beneath it. In this case, if we must designate a head, we will simply use a head offset of 0.
The heads of the pre-existing structures may change as well. In this case, we can simply do a bottom-up traversal of the binary-branching structure after it is fully embedded. In this traversal, we simply check for head offsets that are greater than 1, and when we find them, we reset them to 1. This keeps the head pointing to the node where the removed subforest will be attached, so that the head information can be reconstituted later.
In the second step of the tree excavation, we can recover the original heads by adding together the i-dimensional head offsets of the adjoined structures, where i is the dimension of the link that was originally removed in the embedding process. The sum will be the new i-dimensional head offset of the adjoined structure, which will be the same head offset used in the the arbitrary-branching rule that produced the adjoined structure.
automaton.embed(label,(n-1)-dimensional tree t, state) {
for dim from n-1 to 0 {
for i from dim+1 to n {
if (t has link i) {
restate t as unique t'
u = filltree(t.link(i),i,tree.id_head_offset(i)-1)
embed(unique label T', u, t')
t.unlink(i)
}
}
t = t.link(dim) if dim > 0
}
fix_head(tree,n-1)
add(label,tree,state)
}
filltree(n-dim tree t, dim, offset) {
tree.set_head_offset(offset,dim)
for i from dim to n-1 {
t = new tree with t as 1-dimensional child
t.setstate(tau)
}
return t
}
findheads(n-dimensoinal tree t, ydim) {
if (ydim == 0) {
set_head_offset(0)
return
}
fixheads(tree.link(ydim),ydim-1)
if( tree.get_head_offset() > 1 ) {
tree.set_head_offset(1)
}
}
In the project algorithm, we moved the loop that handles the projecting of the children of the current node because the existence of an n-dimensional link would cause the children to be moved to a point lower in the tree, and handled again later on in the tree. This was creating an inifinte loop. The modification is emphasized below. Now we only recur on the children if there is no n-dimensional link to be handled. If there is, we do not recur because we will return to these children later in the process and eventually deal with them.
project(tree) {
if(tree == empty) {
return tree;
}
moved
if(tree has an nd link) {
if(tree has (n-1)-dimensional link)
find_foot(tree.link(n)).link(n-1)=tree.link(n-1)
for i from n-2 to 1
tree.link(n).link(i) = tree.link(i)
tmp = project(tree.link(n))
set all tree links to NULL
delete tree
tmp.set_dim(n-1)
return tmp
}
else
for i from 1 to n-1
tree.link(i) = project(tree.link(i))
tree.set_dim(n-1)
return tree
}
We have completed code to implement an n-dimensional tree recognizer which now takes heads into account. We also have coded a projection algorithm which converts an n-dimensional headed tree into an (n-1)-dimensional headed tree. Here are some examples:
Original 3d tree:
S(~,~,S(~,e(~,~,~),S(~,a[1](X(~,~,S(~,Y(c(~,~,~),~,S(~,b[1](S(~,~,~),~,~),S(~,a[1](X(~,~,S(~,Y(c(~
,~,~),~,S(~,b[1](S(~,~,~),~,~),S(~,a[1](X(~,~,S(~,Y(c(~,~,~),~,S(~,b[1](S(~,~,~),~,~),~)),~)),~,~),~))
),~)),~,~),~))),~)),~,~),S(~,a[1](X(~,~,S(~,Y(c(~,~,~),~,S(~,b[1](S(~,~,~),~,~),~)),~)),~,~),~))))
1-d (string) projection:
a0[1](b0[0](a0[0](a0[0](a0[0](b0[0](b0[0](b0[0](e0[0](c0[0](c0[0](c0[0](c0[0](~)))))))))))))

And the n-dimensional recognizer output using the example tree and automaton from our original attempt at an n-dimensional recognizer:
SF[0](~,~,SS[0](~,eE[0](~,~,~),SS[0](~,aA[1](XX[0](~,~,SS[0](~,YY[0](cC[0](~,~,~),~,SS[0](~,bB[1]
(SS[0](~,~,~),~,~),SS[0](~,aA[1](XX[0](~,~,SS[0](~,YY[0](cC[0](~,~,~),~,SS[0](~,bB[1](SS[0](~,~,~)
,~,~),SS[0](~,aA[1](XX[0](~,~,SS[0](~,YY[0](cC[0](~,~,~),~,SS[0](~,bB[1](SS[0](~,~,~),~,~),~)),~)),~
,~),~))),~)),~,~),~))),~)),~,~),SS[0](~,aA[1](XX[0](~,~,SS[0](~,YY[0](cC[0](~,~,~),~,SS[0](~,bB[1]
(SS[0](~,~,~),~,~),~)),~)),~,~),~))))

For each n-dimensional link replace node with its (n-1)-dimensional local yield, the node's (n-1)-dimensional successor is now the (n-1)-dimensional successor of the maximal node in (n-1)-dimensional spine of the yield. Because it is the maximal node in the (n-1)-dimensional spine it cannot have an (n-1)-dimensional successor and it is safe to append nodes below it. However, we are still permitted to add an (n-1)-dimensional successor to it because we are simply extending the (n-1)-dimensional spine by appending an (n-2)-dimensional yield. This is permitted to be a (n-1)-dimensional successor because it is an (n-1)-dimensional successor prior to moving it.
All other successors (less than (n-1)-dimensional) of the node become successors of the root of the (n-1)-yield. Since it is the root of an (n-1)-dimensional yield it doesn't already have any j dimensional successors for j < n-1. Also, after replacing its predecessor node it will be the same type of successor that its predecessor was, and because of this it is valid to move the successors from the node to the root of the yield.
When we replace a node with its n-dimensional link, the new node, which was an n-dimensional successor, is now either a root or an i-dimensional successor, n < i < 1. Since it was an n-dimensional successor, the least dimension of its pre-existing children was n-1, so it is therefore permitted to be an (n-1)-dimensional root or an i-dimensional successor, n-1 < i < 1.
project(tree) {
if(tree == empty) {
return tree;
}
for i from 1 to n-1
tree.link(i) = project(tree.link(i))
if(tree has an nd link) {
if(tree has (n-1)-dimensional link)
find_foot(tree.link(n)).link(n-1)=tree.link(n-1)
for i from n-2 to 1
tree.link(n).link(i) = tree.link(i)
tmp = project(tree.link(n))
set all tree links to NULL
delete tree
tmp.set_dim(n-1)
return tmp
}
else
tree.set_dim(n-1)
return tree
}
Tomorrow we will think about why the last two definitions work, and then continue from there.
Representing Multidimensional Trees v0.3
We have built an arbitrary dimensional tree recognizer (over non-headed structures) and a 3-dimensional CKY string recognizer (with heads built in).
In looking at adding heads to higher dimensional structures, we have begun to more precisely formalize our abstract notion of multidimensional trees (headed or not). Both the internal form and the linearized external form fall out of the abstract structure, one as an implementation using pointers, the other as, essentially, the term algebra associated with the structure-building operation of our inductive definition.
We had been referring to the generalized "left-child/right-sibling" representation of trees as "threaded", but this clashes with existing use. For now let's refer to the 2-dimensional trees simply as "left-child/right-sibling" trees. Once we make the observation that a right-sibling for the root converts the tree to an ordered forest we can refer to these as "ordered (headed) forests". (With the exception of the references to "threaded form", this is consistent with what you have written so far.)
The generalization will be "multi-dimensional ordered (headed) forests". In establishing the type of the structure building operation for the general case we have to pick out those forests which can possibly be the set of subtrees of a
point with respect to a given dimension i. These sets contain full d-dimensional trees, but must have a unique minimum with respect to the ordering in the (i-1)-dimension. (Which is to say the initial point in the forest must be a (i-1)-dimensional root---it must have no (i-1)-dimensional parent and, hence, no successors in any dimension less than (i-1).) This leads to the notion of (i,d)-forests, d-dimensional forests in which the initial point is an i-dimensional root. These are the structures described in Alex's post, although I suspect we will eventually converge on a more concise way of putting it. (Not to fault the way it is expressed, this is as good as we've got so far.)
Here is what we have so far:
Representing Multidimensional Trees
(Appended to the previous post)
It would be useful to develop some sort of concept for the heights of these
trees. Using our approach, each node could be annotated with its height in
each dimension. We define the i-dimensional height of a node to be the maximum
depth under that node in an i-dimensional structure.
In order to calculate the height of a particular node for the i dimension, we
must take the maximum of the heights in that dimension of all the node's
i-dimensional children -- the nodes reachable by following zero or more
(j < i)-dimensional links from its i-dimensional successor -- and add one. For
the purpose of this calculation, the empty tree is taken to have a height of
-1.
The downside to this calculation is that it requires us to traverse an entire
(i-1)-dimensional forest in order to determine the i-dimensional height of a
single node. By annotating the nodes using a pair of heights per dimension
instead, we can propagate all the necessary information up the tree and thereby
skirt this problem. In each pair, the first i-dimensional height (the general
height) represents the height of this node from the perspective of a parent --
that is, the i-dimensional height obtained by following any links of dimension
≤ i. The second i-dimensional height (the specific height) represents the
actual height of this node -- that is, the i-dimensional height obtained by
following the node's i-dimensional link followed by any links of dimension ≤
i.
When a new node is constructed, its specific height in dimension i is
determined simply by taking the general height of dimension i for it's
i-dimensional successor, and adding 1 to that value. For the purposes of this
calculation (and for the calculation of the general heights as well), the
height of the empty tree is taken to be -1 in all dimensions.
In contrast, when propagating the general heights to a new node along an
i-dimensional link, two things happen. First, the general height of dimension
i is increased by 1, as in the calculation of the specific height for dimension
i. Second, since we are following an i-dimensional link, all general heights
for dimensions less than i are zeroed out.
Since a node may potentially be formed using links in more than one dimension,
this process must be carried out independently for each successor. The new
general height in each dimension is then obtained by taking the maximum value
in that dimension over all of the node's successors.
The classic representation of 2-dimensional trees, a parent with a set of children, does not fit well into a generalized form for a tree in higher dimensions. Since a 2-dimensional tree has arbitrarily many 2-dimensional children that have a 1-dimensional ordering, a node in a 3-dimensional tree would have arbitrarily many 3-dimensional children that have a 2-dimensional ordering. For an n-dimensional tree, a node would have arbitrarily many n-dimensional children with an (n-1)-dimensional ordering.
Using a "threaded form" for trees allows us to simply add one child per node for every dimension in the tree. Instead of maintaining arbitrarily many n-dimensional children, nodes in the threaded form contain a single child for every dimension. This creates a "left child, right sibling" relationship for 2-dimensional trees, where a root is the parent of a linearly ordered forest. The construction becomes generalized to two operations: adding another sibling to a forest and adjoining a forest under one root node.
This allows not only for arbitrarily many dimensions, but also for constructing arbitrary branching structures. So for a 2-dimensional tree, a node would have a 2nd and 1st dimensional child. The second dimensional child would be the first tree in a linearly ordered forest, while the first dimensional child would be a sibling in that node's own forest. The same structure holds for a 3-dimensional tree, but the children are now organized in a two dimensional tree rather than a linear structure.
Generalizing the two tree-building operations to n dimensions, however, requires that we create a new, generalized operation. Instead of first adding siblings to create a forest, and then assigning the forest a root node, we will do this in one step. Taking a label and two forests, we will create a new node that is the root of the first forest and the leftmost sibling of the second forest. We can then generalize this operation to one that takes d d-dimensional forests and a label and combines them under a new node of that label to form another d-dimensional forest. The formation of roots is still supported by using the empty tree as arguements in the operations. In two dimensions, a 2-dimensional root is formed if there is no 1-dimensional child. Otherwise, it is a sibling. For n-dimensions, an i-dimensional root is formed if there is no j-dimensional child, where j < i. Otherwise, a sibling is formed.
The linear form of threaded trees follows directly from their structure as well as how they are interpreted. Each node may only have one successor in each dimension and each of these successors is used to interpret the yield of the tree rooted at the node. The representation reflects this in the fact that each node is followed by a list of its successors within parentheses. This can be seen as the adjoining operation with the terms within the parentheses representing the forests which are being combined and the outside label representing either a new sibling or root forest.
The linear form is defined inductively as follows: