    Array-------------------+
    | "a" | "b" | "c" | "d" |
    +-----------------------+

All four objects live at the same ``level.'' A loop is used to systematically process the contents of the structure, examining Cell 0, Cell 1, Cell 2, etc.
    front    Cell--+---+    Cell--+---+    Cell--+------+   rear
    | o-|--> | "a" | o-|--> | "b" | o-|--> | "c" | null |<--|-o |
    +---+    +-----+---+    +-----+---+    +-----+------+   +---+

A loop is used to traverse the cells of the list, starting from the entry point and following the links embedded in the cells of the list.
    Cell-----------------------+
    | "a"                      |
    | Cell--------------+      |
    | | "b"             |      |
    | | Cell-------+    |      |
    | | | "c"      |    |      |
    | | | null     |    |      |
    | | +----------+    |      |
    | +-----------------+      |
    +--------------------------+

A recursively defined method is used to systematically process the contents of the structure.
When a recursive data structure is designed so that each of its levels contains levels of only simpler complexity, then we say that the structure is an inductive data structure. (An example of a layered data structure that is not inductive is a list of infinite length.) In this course, we work only with inductive data structures.
Although it may seem initially awkward to organize objects into levels, the technique is valuable in practice, because it readily supports structures that can organically ``grow'' while a program executes.
We used class Cell to implement a data type of lists. It is time to define precisely what a list is. The classic name is a cons list (or conslist), and we describe it precisely by means of a set of definitional clauses:
If you wish, you can ``draw'' the inductive definition:

    Nil

    Cons-----------------+
    | h: Object          |
    | t: +-Conslist-+    |
    |    | ...      |    |
    |    +----------+    |
    +--------------------+
    Cons-----------------------+
    | h: "a"                   |
    | t: Cons--------------+   |
    |    | h: "b"          |   |
    |    | t: Cons-------+ |   |
    |    |    | h: "c"   | |   |
    |    |    | t: Nil   | |   |
    |    |    +----------+ |   |
    |    +-----------------+   |
    +--------------------------+

is a picture of a Conslist that holds three string objects, "a", "b", and "c", within three levels of nested structure.
Since the pictures quickly get huge, we will often use the linear forms, e.g.,
    Cons("a", Cons("b", Cons("c", Nil) ) )

to indicate the list's nested structure.
The correctness of the structure's construction is formally justified as follows:
Now, data-structure building is a kind of physical ``game'', where we start with some basic piece, e.g., Nil, and we place it along with an object, like "c", in a box --- a Cons box. We then place the Cons box plus another object, say "b", into another Cons box. And so on. This is how we build Conslists.
Later, when we want to retrieve the string objects we ``packed'' into the Conslist, we will have to ``open'' the Cons boxes, one level at a time.
Given the inductive definition of conslist, we must use Java programming phrases to mimic the definition. Here is the way we were doing it, without even knowing that we were doing it:
    new Cell("a", new Cell("b", new Cell("c", null)))

Of course, we can assemble this 3-level structure in increments, if we desire:

    Cell x = new Cell("c", null);
    Cell y = new Cell("b", x);
    Cell z = new Cell("a", y);
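The phrases above rely on a class Cell with a two-argument constructor and the accessors getVal and getNext, whose definition is not repeated in this section. Here is a minimal sketch of what such a class might look like; the field names, the immutability, and the demo class CellDemo are assumptions made for illustration, not the course's actual code:

```java
/** A hypothetical minimal Cell: one level of a conslist. */
class Cell {
    private final Object val;   // the object held at this level (the "h" of Cons(h, t))
    private final Cell next;    // the rest of the list (the "t"); null plays the role of Nil

    Cell(Object val, Cell next) {
        this.val = val;
        this.next = next;
    }

    Object getVal() { return val; }
    Cell getNext() { return next; }
}

class CellDemo {
    /** Builds Cons("a", Cons("b", Cons("c", Nil))) as one nested expression. */
    static Cell nested() {
        return new Cell("a", new Cell("b", new Cell("c", null)));
    }

    /** Builds the same list in increments, innermost level first. */
    static Cell incremental() {
        Cell x = new Cell("c", null);
        Cell y = new Cell("b", x);
        Cell z = new Cell("a", y);
        return z;
    }

    public static void main(String[] args) {
        Cell z = incremental();
        // Open the "boxes" one level at a time.
        System.out.println(z.getVal());
        System.out.println(z.getNext().getVal());
        System.out.println(z.getNext().getNext().getVal());
        System.out.println(z.getNext().getNext().getNext() == null);
    }
}
```

Both construction styles yield the same three-level structure; the incremental style merely gives a name to each level as it is built.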
When a data type is defined with an inductive definition, computation on elements of the data type is also defined inductively (recursively), and this is implemented by means of a recursively defined method.
The idea goes as follows: Since there are two forms of Conslists, we should have two recipes for processing a Conslist---one for the Nil-structure and one for the Cons-structure. We might write each ``recipe'' as an equation, as in algebra, like this:
    process( Nil ) = return ...some simple answer...

    process( Cons(h, t) ) = ... use recursion to compute t_answer = process(t);
                            return ...an answer built from h and t_answer...

Here is the lengthOf example specified in equational style:
    /** length of a ConsList: */
    lengthOf( Nil ) = 0
    lengthOf( Cons(h, t) ) = 1 + lengthOf(t)

This set of equations, one per clause in the inductive definition, describes the computational steps needed to descend into the levels of a conslist object so that we can compute its length (or, if you will, its depth).
The above schema is mechanically reformatted into a Java method when we use null and class Cell to implement a conslist:
    public int lengthOf(Cell l) {
        int length;
        if ( l == null ) {
            length = 0;
        } else {
            length = 1 + lengthOf(l.getNext());
        }
        return length;
    }

Indeed, for inductively defined data structures, the equational-schema format gives us a fool-proof algorithm for processing the data structures!
To see how the equations compute an answer, here again is the specification of lengthOf:

    lengthOf( Nil ) = 0
    lengthOf( Cons(h, t) ) = 1 + lengthOf(t)

and here is the calculation, like one would do in algebra class:
    lengthOf( Cons("a", Cons("b", Cons("c", Nil))) )
        because the argument has form, Cons(...,...), use the second equation:
    = 1 + lengthOf( Cons("b", Cons("c", Nil)) )
        again, use the second equation:
    = 1 + 1 + lengthOf( Cons("c", Nil) )
    = 1 + 1 + 1 + lengthOf( Nil )
        the first equation applies here:
    = 1 + 1 + 1 + 0
    = 3

Here is a two-dimensional drawing of the above calculation; the drawing shows how the recursive style of data-structure processing descends into the Conslist structure while it calculates its answer:
    lengthOf( Cons-----------------------+
              | "a"                      |
              | Cons--------------+      |
              | | "b"             |      |
              | | Cons-------+    |      |
              | | | "c"      |    |      |
              | | | Nil      |    |      |
              | | +----------+    |      |
              | +-----------------+      |
              +--------------------------+ )

    = 1 + lengthOf( Cons--------------+
                    | "b"             |
                    | Cons-------+    |
                    | | "c"      |    |
                    | | Nil      |    |
                    | +----------+    |
                    +-----------------+ )

    = 1 + 1 + lengthOf( Cons-------+
                        | "c"      |
                        | Nil      |
                        +----------+ )

    = 1 + 1 + 1 + lengthOf( Nil )

    = 1 + 1 + 1 + 0

    = 3
How does the above reasoning translate into Java programming? Once again, here is the coding of lengthOf in Java:
    public int lengthOf(Cell l) {
        int length;
        if ( l == null ) {
            length = 0;
        } else {
            length = 1 + lengthOf(l.getNext());
        }
        return length;
    }

We might well ask: Does the execution of the Java coding construct the graphical structures shown in the above drawings? Well, not exactly---recall that computer heap storage is ``flat'' and nonnested. Remember that we have been using Cells to represent such conslists. Hence, a three-level nested structure like Cons("a", Cons("b", Cons("c", Nil))) is in fact mimicked by three separate cells (and null) that are linked together with storage addresses:
    a4: Cell----+        a3: Cell----+        a2: Cell----+
        | "a"   |            | "b"   |            | "c"   |
        | a3    |            | a2    |            | null  |
        +-------+            +-------+            +-------+

Remember also that a series of recursive-method invocations is modelled with the activation-record stack in the Java Virtual Machine. Thus, an execution configuration like this one:
    1 + 1 + lengthOf( Cons("c", Nil) )

or drawn graphically,
    1 + 1 + lengthOf( Cons-------+
                      | "c"      |
                      | Nil      |
                      +----------+ )

shows the situation where a list is partially counted after three invocations of lengthOf (the original call and two recursive ones). Recall from the previous lecture that, within the Java Virtual Machine, the activation-record stack looks like this: (Note: the stack is tipped on its side so that it is growing from left to right.)
                                                      top
                                                       |
    +--------------------------------------------------V-----
    | +---------------+  +---------------+  +-----------+
    | | l == a4       |  | l == a3       |  | l == a2   |
    | | length = 1 + ?|  | length = 1 + ?|  |           |
    | |   ...         |  |   ...         |  |   ...     |
    | +---------------+  +---------------+  +-----------+
    +--------------------------------------------------------

The activation-record stack shows that lengthOf has started three times, and the most recent activation is trying to count the length of the list at address a2. Once the a2-list length is counted, the answer, an integer, will be returned to the caller, which adds one to it, giving the length of the a3-list. That answer is returned to its caller, which adds one, giving the length of the a4-list.
Again, please review the previous lecture to see how the Java Virtual Machine uses an activation-record stack to compute recursive method calls.
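One way to watch the activations pile up and then unwind is to instrument lengthOf so that each activation reports when it starts and what it returns. The following is only a sketch: the class Cell is a hypothetical minimal version, repeated so that the example compiles on its own, and the extra indent parameter and the class name LengthTrace are additions made for the demonstration.

```java
/** A hypothetical minimal Cell, repeated here so the sketch compiles on its own. */
class Cell {
    private final Object val;
    private final Cell next;
    Cell(Object val, Cell next) { this.val = val; this.next = next; }
    Object getVal() { return val; }
    Cell getNext() { return next; }
}

class LengthTrace {
    /**
     * Same logic as lengthOf, plus an indent string (an addition for this
     * sketch) that grows by one step with every new activation.
     */
    static int lengthOf(Cell l, String indent) {
        String what = (l == null) ? "null" : "the cell holding " + l.getVal();
        System.out.println(indent + "start: l is " + what);
        int length;
        if (l == null) {
            length = 0;
        } else {
            // This activation waits here ("length = 1 + ?") until the callee returns.
            length = 1 + lengthOf(l.getNext(), indent + "    ");
        }
        System.out.println(indent + "return " + length);
        return length;
    }

    public static void main(String[] args) {
        Cell list = new Cell("a", new Cell("b", new Cell("c", null)));
        System.out.println("length = " + lengthOf(list, ""));
    }
}
```

In the printed trace, each deeper level of indentation corresponds to one more activation record on the stack; the ``return'' lines appear in the opposite order, as the records are popped.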
In summary, the equational calculations and two-dimensional drawings give us powerful design and reasoning tools that are more elegant than, but nonetheless consistent with, the actual computer implementation. When trying to solve a complex data-structure problem, it is often helpful to visualize the solution as a graphical computation on the nested, recursive data structure.
If you are interested, here is another standard example written in the recursive style:
    /** toString assembles a string representation of a Conslist: */
    toString( Nil ) = ""
    toString( Cons(h, t) ) = h.toString() + " " + toString(t)

It is easy to reformat this example into a recursive Java method:

    public String toString(Cell l) {
        String answer;
        if ( l == null ) {
            answer = "";
        } else {
            answer = l.getVal().toString() + " " + toString(l.getNext());
        }
        return answer;
    }
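To try the method, it can be placed in a small test class. In this sketch, the class Cell is a hypothetical minimal version (repeated so that the example compiles on its own), the method is made static for convenience, and the class name ListPrinter is invented for the demonstration.

```java
/** A hypothetical minimal Cell, repeated here so the sketch compiles on its own. */
class Cell {
    private final Object val;
    private final Cell next;
    Cell(Object val, Cell next) { this.val = val; this.next = next; }
    Object getVal() { return val; }
    Cell getNext() { return next; }
}

class ListPrinter {
    /** One branch per equation: toString(Nil) and toString(Cons(h, t)). */
    static String toString(Cell l) {
        String answer;
        if (l == null) {
            answer = "";                                   // toString(Nil) = ""
        } else {
            answer = l.getVal().toString() + " "           // h.toString() + " "
                     + toString(l.getNext());              //   + toString(t)
        }
        return answer;
    }

    public static void main(String[] args) {
        Cell list = new Cell("a", new Cell("b", new Cell("c", null)));
        // Each Cons level contributes its element followed by one space,
        // so the result ends with a trailing space.
        System.out.println("[" + toString(list) + "]");
    }
}
```

Note that the equations place a space after every element, including the last one; a caller that dislikes the trailing space can remove it with String's trim method.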
Both lengthOf and toString are simple examples of recursive processing that traverses all elements of a conslist. Of course, we know that we can traverse a list with a mere while-loop. Are there patterns of list processing that are not merely mimicking loops? Yes---here are two:
    /** append accepts two conslists as arguments and
      * builds a new list that has the contents
      * of the two, appended together */
    append( Nil, ys ) = ys
    append( Cons(h, t), ys ) = Cons(h, append(t, ys))

The above pattern is worth pondering---append(list1, list2) says we should build the appended list by descending into the innermost structure of list1, level by level, attaching list1's elements onto the front of list2 as we go. It is enlightening to see this in a calculational trace:
    append( Cons("a", Cons("b", Nil)), Cons("z", Nil) )
    = Cons("a", append( Cons("b", Nil), Cons("z", Nil) ))
    = Cons("a", Cons("b", append( Nil, Cons("z", Nil) )))
    = Cons("a", Cons("b", Cons("z", Nil)))

Notice how the Cons structures are enclosing the recursive invocations.
It is fun to redraw the calculation graphically:
    append( Cons--------------+    Cons-------+
            | "a"             |    | "z"      |
            | Cons-------+    |    | Nil      |
            | | "b"      |    |    +----------+ )
            | | Nil      |    |
            | +----------+    |
            +-----------------+ ,

    = Cons--------------------------------------------+
      | "a"                                           |
      | append( Cons-------+    Cons-------+          |
      |         | "b"      |    | "z"      |          |
      |         | Nil      |    | Nil      |          |
      |         +----------+ ,  +----------+ )        |
      +-----------------------------------------------+

    = Cons--------------------------------------------+
      | "a"                                           |
      | Cons---------------------------------+        |
      | | "b"                                |        |
      | | append( Nil, Cons-------+          |        |
      | |              | "z"      |          |        |
      | |              | Nil      |          |        |
      | |              +----------+ )        |        |
      | +------------------------------------+        |
      +-----------------------------------------------+

    = Cons---------------------------+
      | "a"                          |
      | Cons----------------+        |
      | | "b"               |        |
      | | Cons-------+      |        |
      | | | "z"      |      |        |
      | | | Nil      |      |        |
      | | +----------+      |        |
      | +-------------------+        |
      +------------------------------+

It is a bit amazing that we can replicate this systematic recursive descent in Java, but we can:
    public Cell append(Cell list1, Cell list2) {
        Cell answer;
        if ( list1 == null ) {
            answer = list2;
        } else {
            answer = new Cell(list1.getVal(), append(list1.getNext(), list2));
        }
        return answer;
    }

The power comes from nesting the recursive invocation, append(list1.getNext(), list2), inside the use of new Cell( ... )!
Here is a question for you to consider: For this example,
    Cell alist = new Cell("a", new Cell("b", null));
    Cell blist = new Cell("z", null);
    Cell clist = append(alist, blist);

is either alist or blist altered due to the use of append(alist, blist) to construct clist? The answer is no---indeed, clist and blist share the same objects, but the lists are not altered! (If you are uncertain of this, draw a picture of computer heap storage and work the example by hand.)
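The sharing can also be checked by experiment, using == to compare object addresses. This is a sketch only: the class Cell is a hypothetical minimal version (repeated so that the example compiles on its own), append is made static for convenience, and the class name AppendDemo is invented for the demonstration.

```java
/** A hypothetical minimal Cell, repeated here so the sketch compiles on its own. */
class Cell {
    private final Object val;
    private final Cell next;
    Cell(Object val, Cell next) { this.val = val; this.next = next; }
    Object getVal() { return val; }
    Cell getNext() { return next; }
}

class AppendDemo {
    static Cell append(Cell list1, Cell list2) {
        Cell answer;
        if (list1 == null) {
            answer = list2;    // list2 is returned as is, not copied
        } else {
            // A fresh cell is built for every level of list1.
            answer = new Cell(list1.getVal(), append(list1.getNext(), list2));
        }
        return answer;
    }

    public static void main(String[] args) {
        Cell alist = new Cell("a", new Cell("b", null));
        Cell blist = new Cell("z", null);
        Cell clist = append(alist, blist);

        // The tail of clist is the very same object as blist (shared, not copied).
        System.out.println(clist.getNext().getNext() == blist);
        // The front of clist consists of new cells, not alist's cells.
        System.out.println(clist != alist);
        // alist still ends after "b": it was not altered.
        System.out.println(alist.getNext().getNext() == null);
    }
}
```

All three lines print true: append copies the levels of its first argument and merely links the copy to its second argument.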
Here is a second example that employs a similar cleverness:
    /** reverse builds a list that is the reversed version
      * of its argument */
    reverse( Nil ) = Nil
    reverse( Cons(h, t) ) = append( reverse(t), Cons(h, Nil) )

This one is a fun exercise for you to work for yourself:
    public Cell reverse(Cell list) {
        Cell answer;
        if ( list == null ) {
            answer = null;
        } else {
            answer = append(reverse(list.getNext()), new Cell(list.getVal(), null));
        }
        return answer;
    }

A question: why must we use new Cell(list.getVal(), null) and not just list.getVal()?
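For readers who want to run reverse, here is a self-contained sketch. The class Cell is a hypothetical minimal version (repeated so that the example compiles on its own); the methods are made static for convenience; and the helper show and the class name ReverseDemo are invented for the demonstration. As a hint toward the question, look at the parameter types of append.

```java
/** A hypothetical minimal Cell, repeated here so the sketch compiles on its own. */
class Cell {
    private final Object val;
    private final Cell next;
    Cell(Object val, Cell next) { this.val = val; this.next = next; }
    Object getVal() { return val; }
    Cell getNext() { return next; }
}

class ReverseDemo {
    /** Both parameters are lists (Cells), not list elements. */
    static Cell append(Cell list1, Cell list2) {
        if (list1 == null) {
            return list2;
        }
        return new Cell(list1.getVal(), append(list1.getNext(), list2));
    }

    static Cell reverse(Cell list) {
        Cell answer;
        if (list == null) {
            answer = null;                                  // reverse(Nil) = Nil
        } else {
            // reverse(Cons(h, t)) = append(reverse(t), Cons(h, Nil)):
            // the head is wrapped in a one-element list before it is appended.
            answer = append(reverse(list.getNext()), new Cell(list.getVal(), null));
        }
        return answer;
    }

    /** Helper for the demo: each element followed by one space. */
    static String show(Cell l) {
        return (l == null) ? "" : l.getVal() + " " + show(l.getNext());
    }

    public static void main(String[] args) {
        Cell list = new Cell("a", new Cell("b", new Cell("c", null)));
        System.out.println(show(reverse(list)));
        System.out.println(show(list));   // the argument itself is left unchanged
    }
}
```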
Other data structures can be defined by means of inductive definition. Here are some classic examples:
A huge advantage of this form of file system is that it can grow as deeply as needed when the user adds more and more textfiles and folders.
Say that we want to write a program that counts all the textfiles held in a file system. How do we start? The equational specifications show us the way:
    /** countFiles counts the number of textfiles in a file system */
    countFiles( Textfile ) = return 1;

    countFiles( Folder(t1, ..., tm, f1, ..., fn) ) =
        subcounts = 0;
        for ( j in 1 to n )
            subcounts = subcounts + countFiles(fj);
        return m + subcounts;

The recursions into the subfolders neatly total the counts of textfiles in the subfolders, which we sum into the total count.
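The specification can be transcribed into Java once we settle on a representation. The sketch below assumes one possible representation, which is not taken from the lecture: an interface FsEntry with two implementations, Textfile and Folder, where a Folder keeps its textfiles t1, ..., tm and its subfolders f1, ..., fn in two separate lists. All of these names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical types for this sketch: an entry is a Textfile or a Folder. */
interface FsEntry {}

final class Textfile implements FsEntry {
    final String name;
    Textfile(String name) { this.name = name; }
}

final class Folder implements FsEntry {
    final List<Textfile> textfiles;   // t1, ..., tm
    final List<Folder> subfolders;    // f1, ..., fn
    Folder(List<Textfile> textfiles, List<Folder> subfolders) {
        this.textfiles = textfiles;
        this.subfolders = subfolders;
    }
}

class FileCounter {
    /** One branch per equation of the countFiles specification. */
    static int countFiles(FsEntry entry) {
        if (entry instanceof Textfile) {
            return 1;
        }
        Folder folder = (Folder) entry;
        int subcounts = 0;
        for (Folder sub : folder.subfolders) {
            subcounts = subcounts + countFiles(sub);    // recursion into fj
        }
        return folder.textfiles.size() + subcounts;     // m + subcounts
    }

    /** A small sample: root holds two textfiles and a folder "docs";
     *  docs holds one textfile and one empty folder. */
    static Folder sample() {
        Folder empty = new Folder(Collections.<Textfile>emptyList(),
                                  Collections.<Folder>emptyList());
        Folder docs = new Folder(Arrays.asList(new Textfile("notes.txt")),
                                 Arrays.asList(empty));
        return new Folder(Arrays.asList(new Textfile("a.txt"), new Textfile("b.txt")),
                          Arrays.asList(docs));
    }

    public static void main(String[] args) {
        System.out.println(countFiles(sample()));   // 2 + (1 + 0) = 3
    }
}
```

The loop handles the width of a folder (its n subfolders), while the recursion handles its depth; neither alone would suffice.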
Admittedly, representing numbers as nested structures is a game, but the ``game'' motivates modern-day set theory and even the construction of computer circuits.
Here are some examples of processing natural numbers:
Checking if a number is even:
    isEven( Zero ) = true
    isEven( Succ(N) ) = !isEven(N)

An example: is 3 even?

    isEven(Succ(Succ(Succ(Zero))))
    = ! isEven(Succ(Succ(Zero)))
    = ! ! isEven(Succ(Zero))
    = ! ! ! isEven(Zero)
    = ! ! ! true
    = ! ! false
    = ! true
    = false
Doubling a number:
    double( Zero ) = Zero
    double( Succ(N) ) = Succ(Succ( double(N) ))

Try it---say we compute double( Succ(Succ(Zero)) ):

    double( Succ(Succ(Zero)) )
    = Succ(Succ( double( Succ(Zero) ) ))
    = Succ(Succ( Succ(Succ( double(Zero) )) ))
    = Succ(Succ( Succ(Succ( Zero )) ))
Addition:
    add( Zero, N ) = N,  where N is any Nat-object whatsoever
    add( Succ(M), N ) = Succ( add(M, N) )

An example of 3 + 2:

    add( Succ(Succ(Succ(Zero))), Succ(Succ(Zero)) )
    = Succ( add( Succ(Succ(Zero)), Succ(Succ(Zero)) ) )
    = Succ( Succ( add( Succ(Zero), Succ(Succ(Zero)) ) ) )
    = Succ( Succ( Succ( add( Zero, Succ(Succ(Zero)) ) ) ) )
    = Succ( Succ( Succ( Succ(Succ(Zero)) ) ) )
Multiplication:
    mult( Zero, N ) = Zero
    mult( Succ(M), N ) = add( N, mult(M, N) )

This definition exploits the arithmetical fact that multiplication is repeated addition. Here is 3 * 2:

    mult( Succ(Succ(Succ(Zero))), Succ(Succ(Zero)) )
    = add( Succ(Succ(Zero)), mult( Succ(Succ(Zero)), Succ(Succ(Zero)) ) )
    = add( Succ(Succ(Zero)), add( Succ(Succ(Zero)), mult( Succ(Zero), Succ(Succ(Zero)) ) ) )
    = add( Succ(Succ(Zero)), add( Succ(Succ(Zero)), add( Succ(Succ(Zero)), mult( Zero, Succ(Succ(Zero)) ) ) ) )
    = add( Succ(Succ(Zero)), add( Succ(Succ(Zero)), add( Succ(Succ(Zero)), Zero ) ) )

At this point, we can apply the definition of add to compute the final answer, Succ(Succ(Succ(Succ(Succ(Succ(Zero)))))).
In a similar way, all of arithmetic can be defined as recursively defined operations on Nat objects, and indeed, all mechanical computation can be formalized solely in terms of recursive programming patterns on Nat objects.
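The four sets of equations above (isEven, double, add, mult) can be transcribed into Java almost symbol for symbol. The sketch below assumes one possible representation of Nat objects, an abstract class Nat with subclasses Zero and Succ; these classes, the helper toInt, and the class name NatOps are hypothetical. Because double is a reserved word in Java, the doubling operation is named doubleOf here.

```java
/** Hypothetical representation: a Nat is either Zero or Succ(pred). */
abstract class Nat {}

final class Zero extends Nat {}

final class Succ extends Nat {
    final Nat pred;   // the Nat nested one level inside this one
    Succ(Nat pred) { this.pred = pred; }
}

class NatOps {
    static boolean isEven(Nat n) {
        if (n instanceof Zero) { return true; }                  // isEven(Zero) = true
        return !isEven(((Succ) n).pred);                         // isEven(Succ(N)) = !isEven(N)
    }

    static Nat doubleOf(Nat n) {
        if (n instanceof Zero) { return new Zero(); }            // double(Zero) = Zero
        return new Succ(new Succ(doubleOf(((Succ) n).pred)));    // Succ(Succ(double(N)))
    }

    static Nat add(Nat m, Nat n) {
        if (m instanceof Zero) { return n; }                     // add(Zero, N) = N
        return new Succ(add(((Succ) m).pred, n));                // Succ(add(M, N))
    }

    static Nat mult(Nat m, Nat n) {
        if (m instanceof Zero) { return new Zero(); }            // mult(Zero, N) = Zero
        return add(n, mult(((Succ) m).pred, n));                 // add(N, mult(M, N))
    }

    /** Helper for the demo only: counts the Succ levels. */
    static int toInt(Nat n) {
        if (n instanceof Zero) { return 0; }
        return 1 + toInt(((Succ) n).pred);
    }

    public static void main(String[] args) {
        Nat two = new Succ(new Succ(new Zero()));
        Nat three = new Succ(two);
        System.out.println(isEven(three));            // false
        System.out.println(toInt(doubleOf(two)));     // 4
        System.out.println(toInt(add(three, two)));   // 5
        System.out.println(toInt(mult(three, two)));  // 6
    }
}
```

Notice that toInt has exactly the shape of lengthOf: a Nat is, in effect, a conslist that holds no elements, only levels.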
An example:
    /** computing the decimal value of a BinaryNumber: */
    valueOf( 0 ) = 0
    valueOf( 1 ) = 1
    valueOf( [N]0 ) = 2 * valueOf(N)
    valueOf( [N]1 ) = (2 * valueOf(N)) + 1

As an exercise, you might write the equations for adding, multiplying, etc., binary numbers. The equations you write turn out to be one form of the wiring diagrams taught in circuit theory.
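Here is one way the valueOf equations might be coded. The sketch assumes, purely for convenience, that a BinaryNumber is represented as a nonempty Java String of the characters '0' and '1', so that [N]0 is the string N followed by '0'; the class name BinaryValue is invented, and no checking of malformed input is attempted.

```java
class BinaryValue {
    /** Assumes b is a nonempty string of '0' and '1' characters. */
    static int valueOf(String b) {
        if (b.equals("0")) { return 0; }                  // valueOf(0) = 0
        if (b.equals("1")) { return 1; }                  // valueOf(1) = 1
        String n = b.substring(0, b.length() - 1);        // the [N] part
        char last = b.charAt(b.length() - 1);             // the final digit
        if (last == '0') {
            return 2 * valueOf(n);                        // valueOf([N]0)
        } else {
            return (2 * valueOf(n)) + 1;                  // valueOf([N]1)
        }
    }

    public static void main(String[] args) {
        System.out.println(valueOf("110"));    // 6
        System.out.println(valueOf("1011"));   // 11
    }
}
```

For instance, valueOf("110") = 2 * valueOf("11") = 2 * ((2 * valueOf("1")) + 1) = 2 * 3 = 6.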
    negation( false ) = true
    negation( true ) = false

    and( false, B ) = false   (for any boolean, B)
    and( true, B ) = B        (for any boolean, B)

    or( false, B ) = B        (for any boolean, B)
    or( true, B ) = true      (for any boolean, B)
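These equations need no recursion, since booleans have no nested levels, but they can be coded in the same one-branch-per-equation style. In this sketch the class name BoolOps is invented, and the main method simply compares each operation against Java's built-in !, && and || on the whole truth table.

```java
class BoolOps {
    static boolean negation(boolean a) {
        if (!a) { return true; }      // negation(false) = true
        return false;                 // negation(true) = false
    }

    static boolean and(boolean a, boolean b) {
        if (!a) { return false; }     // and(false, B) = false
        return b;                     // and(true, B) = B
    }

    static boolean or(boolean a, boolean b) {
        if (!a) { return b; }         // or(false, B) = B
        return true;                  // or(true, B) = true
    }

    public static void main(String[] args) {
        boolean[] values = { false, true };
        boolean agrees = true;
        for (boolean a : values) {
            agrees = agrees && (negation(a) == !a);
            for (boolean b : values) {
                agrees = agrees && (and(a, b) == (a && b));
                agrees = agrees && (or(a, b) == (a || b));
            }
        }
        System.out.println(agrees);   // true
    }
}
```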