Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Tree data structure algorithms in Scheme?

My level of knowledge so far is such that I have completed The Little Schemer without issue and I'm currently 70% through The Seasoned Schemer. I have in the back of my mind ideas for projects I would like to work on to gain some real-world experience working with Scheme, yet (perhaps because I've worked primarily with OO languages throughout my career) I still find myself wondering how one would tackle some fairly basic problems to an OO language in a functional language like Scheme.

Rather than shove all my questions in a single stackoverflow question, I'll just trickle them out over time and assume the pieces will fall into place so I actually don't need answers to other questions.

It's pretty clear that Scheme's thing is lists. Lists and lists of lists. I'm used to being able to store lists that contain "attributes" that can be retrieved quickly (i.e. hashes) and can be nested one inside the other.

Taking an example of passing around a list of files and directories in a file system, recursively, how would one approach something like this in Scheme? I guess you could pass a data structure of the form:

'(("foo" (("bar.txt")
          ("zip.txt")
          ("button.txt")))
  ("other.txt")
  ("one-more" (("time.txt"))))

Where each node is represented as the car of a list and its children represented as another list contained in the car of its cdr, thus the above is a tree structure for:

foo/
  bar.txt
  zip.txt
  button.txt
other.txt
one-more/
  time.txt

Or perhaps one would pass an iterator function that accepts a visitor instead that does some sort of deep-tree traversal? (Not entirely sure how that looks, in terms of knowing when you're switching directories).

Is there a general pattern when it comes to this sort of problem, not just directory trees, but tree structures (with attached meta data) in general?

Does this inevitably finish up being quite fiddly when compared with the object-oriented equivalent?

like image 980
d11wtq Avatar asked Dec 16 '22 22:12

d11wtq


2 Answers

It's pretty clear that Scheme's thing is lists.

Traditionally, yes. But since R6RS, there are also records with named fields, which make working with other kinds of data structures a lot easier. Practical Scheme implementations have had these for decades, but they were never standardized, so the syntax varies.

Is there a general pattern when it comes to this sort of problem, not just directory trees, but tree structures (with attached meta data) in general?

Yes, and with your idea of "iterator functions" you've hit the nail on the head. (Except that it would be even more elegant to define a macro instead, so you can say:

(for-each-label (x tree 'in-order)
  (display x))

Macros are created with define-syntax.)

like image 151
Fred Foo Avatar answered Jan 09 '23 21:01

Fred Foo


I see two parts in your question.

Part one: How to implement trees in Scheme.

In standard R5RS most tree implementations use vectors to represent the nodes in a tree. For a binary tree with one might define some helper functions:

(define (tree-left t) (vector-ref t 0))
(define (tree-right t) (vector-ref t 1))
(define (tree-value t) (vector-ref t 2))
(define (make-node l r v) (vector l r v))
(define (leaf? t) (eq? t #f))

In Schemes with structures/records the above is replaced with>

(define-struct tree (left right value))
(define (leaf? t) (eq? t #f))

If hierarkies of structures are supported one can write

(define-struct tree ())
(define-struct (leaf tree) ())
(define-struct (node tree) (left right value))

For Schemes that support pattern matching code that manipulate trees looks like earily like tree handling code in ML and Haskell.

I can recommend the section on binary search trees in HtDP: http://htdp.org/2003-09-26/Book/curriculum-Z-H-19.html#node_sec_14.2

Part two: How to handle directory traversals

There are several approaches. One common way is to provide a function that given a directory produces an function or sequence that can produce one element (file or subdirectory) at a time. This is avoids the need of storing a potentially huge file tree in memory.

In Racket one can use sequences:

#lang racket
(for ([file (in-directory "/users/me")])
   (displayln file))

There is no directory traversal mechanisms available in R5RS, but all the major implementations of course offer some way or another of doing so.

BTW: It is true, that Scheme provides very good support for single linked lists, but when implementing functional datastructures it is often better to use vectors or even better structures due to faster random element access.

like image 36
soegaard Avatar answered Jan 09 '23 20:01

soegaard