Lambda the Ultimate

inactiveTopic Folding over trees
started 7/29/2001; 8:26:05 AM - last post 8/6/2001; 10:11:09 AM
Ehud Lamm - Folding over trees  blueArrow
7/29/2001; 8:26:05 AM (reads: 1800, responses: 5)
Folding over trees
In addition to those mentioned in the above reference:

Graham Hutton, Fold and Unfold for Program Semantics, Proc. ICFP'98, 1998, pp. 280-288.

Tim Sheard and Leonidas Fegaras, A fold for all seasons, Proc. FPCA'93, pages 233--242, is good -- but highly theoretical. It is to be read after the previously-mentioned papers.

(Thanks to Oleg for pointing these out)

Posted to functional by Ehud Lamm on 7/29/01; 8:30:04 AM

jon fernquest - Re: Folding over trees  blueArrow
8/2/2001; 3:04:40 AM (reads: 1880, responses: 0)
I'm experimenting with higher order fold functions over trees to accomplish practical tasks like a directory tree walker.

(to spot directories/folders with high disk space usage, basically depth-first postfix rightmost n-ary tree traversal with upwards accumulation, see below).

I'm building a foundation of code snippets in Scheme. I'm keeping my experiments here:

Note: that the CAML (ML) book "The Functional Approach to Programming" Cousineau and Mauny has a lot of great stuff with folds but they use different names: it_list = fold_right, list_it = fold_left (reverse), gentree_hom = foldTree.

They're typology of n-ary tree traversal is helpful. For the tree:

(node a (leaf b) (node c (leaf d) (leaf e)) (leaf f)) or (a (b) (c (d) (e)) (f))

breadth-first leftmost: (a b c f d e) breadth-first rightmost: (a f c b e d) depth-first prefix leftmost: (a b c d e f) depth-first prefix rightmost: (a f c e d b) depth-first postfix leftmost: (b d e c f a) depth-first postfix rightmost: (f e d c b a)

1. prefix/postfix means parent visited before/after children 2. leftmost/rightmost means children visited from left-to-right/right-to-left.

I really like "A little Java, A Few Patterns" 's mix of functional programming and object-oriented programming with design patterns. Trying to use this in the directory tree walker above.

Cheers, Jon Fernquest

jon fernquest - Re: Folding over trees  blueArrow
8/3/2001; 12:35:56 AM (reads: 1847, responses: 0)
Here's all the depth-first ways to traverse n-ary trees written simple in PLT Scheme:

Still working on integrating the upwards accumulation part (or rollup) into a fold function.

Oleg - Re: Folding over trees  blueArrow
8/3/2001; 5:46:30 PM (reads: 1833, responses: 0)
I'm afraid you have only three choices:
  • use a second-order fold
  • use an enhanced fold-like operator such as foldts
  • give up on higher-order combinators and use recursion directly

The second-order fold is very inefficient -- it has large latency and it creates as many closures as there are nodes in the tree. The second-order fold basically replaces the tree of nodes with the tree of closures. After the latter is constructed, it is applied to the accumulator.

folts is defined as follows:

data Tree = Leaf String | Nd [Tree]

foldts fdown fup fhere seed (Leaf str) = fhere seed str foldts fdown fup fhere seed (Nd kids) = fup seed $ foldl (foldts fdown fup fhere) (fdown seed) kids

Example: computing an MD5 digest of a tree
md5Init:: MD5Context
md5Update:: String -> MD5Context -> MD5Context
md5Final:: MD5Context -> String

tree_digest2 = md5Final . (foldts fd fu fh md5Init) where fh ctx str = md5Update "/leaf" $ md5Update str $ md5Update "leaf" ctx fd ctx = md5Update "node" ctx fu _ ctx = md5Update "/node" ctx

Finally, the breadth-first tree traversal:

breadth_first = foldr (x seed -> foldl (flip (:)) seed x) [] . foldts fdown fup fhere [[]]
       fhere (level_here:levels_below) str = (str:level_here):levels_below
       fdown (_:[]) = []:[]
       fdown (_:levels_below) = levels_below
       fup (level_here:levels_below) new_levels_below =  level_here:new_levels_below

For example,

breadth_first (Nd [Leaf "0", Nd [Leaf "11", Leaf "12"], Leaf "2"])
evaluates to

Note that

	foldr (x seed -> foldl (flip (:)) seed x) []
is an efficient implementation of
	concat . map reverse

You can't efficiently implement the breadth_first with the first-order fold (and the second-order fold is inefficient). This was the subject of Gibbons and Jones' paper.

jon fernquest - Re: Folding over trees  blueArrow
8/4/2001; 6:53:53 AM (reads: 1833, responses: 0)
Thanks. I'm going to switch gears and rethink this all in ML or Haskell digesting all that you've presented to-date to get my bearings straight. later translate it to Scheme (with its implicit types given by s-expressions).

I was looking at your XML query language. I'm trying to duplicate the functionality of tgrep (grep for trees) hopefully with higher order functions:

Tree Grep looks for syntactic patterns in a parsed corpus, i.e. it filters out the s-expressions that meet certain constraints like dominance and precedence. Example:

Prepositional phrase attachment ambiguity:

1. I heard about the discussion after the meeting.

1a. heard after the meeting

(S (NP (PRO I)) 
  [VP (V heard) 
    [PP (PREP about) 
      (NP (DET the) (N discussion))] 
        [PP (PREP after) (NP (DET the) (N meeting))]]) 

1b. the discussion after the meeting

(S (NP (PRO I)) 
  (VP (V heard) 
    (PP (PREP about) 
      [NP (DET the) (N discussion) 
        (PP (PREP after) (NP (DET the) (N meeting)))]))) 

A fold with state implementing some sort of automata over a tree?


Oleg - Re: Folding over trees  blueArrow
8/6/2001; 10:11:09 AM (reads: 1818, responses: 0)

Haskell syntax is concise indeed; sometimes it may appear too concise. It certainly takes time to get used to.

At first glance, tgrep strongly resembles XPath. It appears you can translate all tgrep queries into XPath (amended with a regular-expression match on element names or character data). The latter amendment is easy in SXPath.

Just for the reference, here are some other ways of querying trees

XML Query Languages: Experiences and Exemplars

XQuery 1.0: An XML Query Language
A "fold with a state" might work as well. We can think of the "foldts" as a transformer from one tree to another. The transformation can be set up in such a way that the branches that match the query are transformed to themselves, and the other branches are transformed to nothing. BTW, I used this approach in LaXmL:

To generate the table of contents I had to query for (Section n xxx) elements. The query was done by transforming the above elements into <li>xxx</li> fragments and everything else into nothing.