June 21, 2004

Multi-Dimensional Tree Input/Output

I've written a readtree function for multi-dimensional trees. This function uses simple recursive descent parsing to convert multi-dimensional string representations of trees into their equivalent in-memory representations:
/* tree readtree(istream& ins, unsigned dimensions, const sobj&, const qobj&)
 * Reads a tree in string form from ins and builds it recursively
 * Returns NULL on error
 */
  
template<class sobj, class qobj,class siobj,class qiobj>
tree<siobj,qiobj>* do_readtree(istream& ins, unsigned dimensions,
                               LabelTable<sobj,siobj> &sTable, bool& fail)
{ 
  char c;
  sobj label;
  const char EMPTY_TREE = '~';
  tree<siobj,qiobj>* root = NULL;
  
  /* Read the node label */
  if (!(ins >> label) || label == EMPTY_TREE)
    return root;
    
  /* Instansiate a local tree using the newly-read label */
  if (!(root = new tree<siobj,qiobj>(sTable.obj_to_index(label))))
    return NULL;
    
  /* Read a left brace */
  if (!(ins >> c))
    { delete root; fail = true; return NULL; }
  
  /* Special case: Don't require that (~,~,~) appear for leaf nodes. */
  if (c != '(')
    { ins.putback(c); return root; }
  
  /* For children of each each dimensionality, starting with the first */
  for (unsigned dim = 0; dim < dimensions; dim++)
  {
    /* Recur to read a tree */
    root->set_link(
      dim, do_readtree<sobj,qobj,siobj,qiobj>(ins, dimensions, sTable, fail)
    );
      
    /* Read a comma */
    if (!(ins >> c))
      { delete root; fail = true; return NULL; }
    
    /* Special case: Treat omitted subtrees as empty. */
    if (dim >= dimensions - 1 || c != ',')
      { ins.putback(c); break; }
  } 
    
  /* Require a matching right brace */
  if (!(ins >> c) || c != ')')
    { delete root; fail = true; return NULL; }
    
  return root;
} 

template<class sobj, class qobj,class siobj,class qiobj>
tree<siobj,qiobj>* readtree(istream& ins, unsigned dimensions,
                            LabelTable<sobj,siobj> &sTable)
{
  bool fail = false;
 
  tree<siobj, qiobj> *root =
    do_readtree<sobj, qobj, siobj, qiobj>(ins, dimensions, sTable, fail);
  
  if (root && fail)
    { delete root; root = NULL; }

  return root;
} 
I've also written a simple function to do the reverse - convert an in-memory multi-dimensional tree representation to its string representation. I've called it uglyprint, in recognition of the fact that Ian's tree output code was clearly superior. Colin's been throwing some pretty large and complex three-dimensional test cases at this thing, and it's held up so far. Posted by brownda at June 21, 2004 03:06 PM
Comments

Input:
S(~,~,S(~,e(~,~,~),S(~,a(X(~,~,S(~,Y(c(~,~,~),~,S(~,b(S(~,~,~),~,~),S(~,a(X(~,~,S(~,Y(c(~,~,~),~,S(~,b(S(~,~,~),~,~),S(~,a(X(~,~,S(~,Y(c(~,~,~),~,S(~,b(S(~,~,~),~,~),~)),~)),~,~),~))),~)),~,~),~))),~)),~,~),S(~,a(X(~,~,S(~,Y(c(~,~,~),~,S(~,b(S(~,~,~),~,~),~)),~)),~,~),~))))

Output:
S(,,S(,e(,,),S(,a(X(,,S(,Y(c(,,),,S(,b(S(,,),,),S(,a(X(,,S(,Y(c(,,),,S(,b(S(,,),,),S(,a(X(,,S(,Y(c(,,),,S(,b(S(,,),,),)),)),,),))),)),,),))),)),,),S(,a(X(,,S(,Y(c(,,),,S(,b(S(,,),,),)),)),,),))))

Posted by: Colin at June 21, 2004 03:08 PM

It's occurred to me that the tree class doesn't have an operator delete that deletes recursively. As a result, this thing leaks memory when parse errors occur at a non-leaf node.

We want a seperate delete_recursive - there are legitimate cases where we want the non-recursive behaviour.

Probably not a high-priority, but I wanted to at least record the bug here.

Posted by: David at June 26, 2004 09:51 AM