Saturday, April 30, 2011

Iterative processes

There is a fundamental difference between a recursive process and a
recursive procedure.

A recursive procedure is one where the procedure calls itself (directly or indirectly) at some point.
In contrast a recursive process is one which has to remember its history of execution.

For eg.
; clojure code
(defn sum [a b]
(if (> a b)
0
(+ a (sum (inc a) b))))

This is small piece of code, written in Clojure, a lisp dialect, which sums up all
integers from a to b. The logic is quite straight forward.

If a > b then ans = 0 because there are no elements between them.
Else Find the sum of all numbers from (a+1) till b and then add 'a' to it.

Typically a function call gets its own stack frame which will be alive till the function
executes. After the function returns a value (potentially) the stack is popped.

So what would (sum 1 5) be?

Assuming stack grows from left to right, this is how to process would execute.

(sum 1 5)
(+ 1 (sum 2 5))
(+ 1 (+ 2 (sum 3 5)))
(+ 1 (+ 2 (+ 3 (sum 4 5))))
(+ 1 (+ 2 (+ 3 (+ 4 (sum 5 5)))))
(+ 1 (+ 2 (+ 3 (+ 4 (+ 5 (sum 6 5))))))
(+ 1 (+ 2 (+ 3 (+ 4 (+ 5 0)))))
(+ 1 (+ 2 (+ 3 (+ 4 5))))
(+ 1 (+ 2 (+ 3 9)))
(+ 1 (+ 2 12))
(+ 1 14)
15

Take your time to understand why the process looks like this.
[Substitute the parameters passed to the function sum at every point into the
definition and you will be able to figure out.
This model of finding out answers is called the substitution model]

In the above example what was the role that the stack played? Looking at the
process's structure we understand that the stack keeps track of the
pending operation at any point.
i.e. if there were no stack then the process would have looked like?

(sum 1 5)
(sum 2 5)
(sum 3 5)
(sum 4 5)
(sum 5 5)
(sum 6 5)
0 -> result

We got '0' because nobody remembers that there is an addition operation is to be done after the call returns.

But is it absolutely necessary to always have a stack?
If you say
"Yes, how else would I remember what operations are left to be done after the function call remembers?" then you have correctly identified the problem.

Now look at this code
(defn sum [a b acc] ;; acc should be 0 initially
(if (> a b)
acc
(sum (inc a) b (+ acc a)))

[Clojure's comments start with ';']

In the above procedure we have an extra parameter, accumulator, which will collect
the answer until that point. Here, using the substitution model, we arrive at the
process structure that looks like the following.

Assuming 'no-op' is used to signify nothing is left to be done.

(sum 1 5 0)
(no-op (sum 2 5 1))
(no-op (no-op (sum 3 5 3)))
(no-op (no-op (no-op (sum 4 5 6))))
(no-op (no-op (no-op (no-op (sum 5 5 10)))))
(no-op (no-op (no-op (no-op (no-op (sum 6 5 15))))))
(no-op (no-op (no-op (no-op (no-op 15)))))
(no-op (no-op (no-op (no-op 15))))
(no-op (no-op (no-op 15)))
(no-op (no-op 15))
(no-op 15)
15

We see that after the stage (sum 6 5 15) arrives, we essentially have found the answer.
All that is to be done now is that the result '15' is to be returned by every stack frame.
In this case, since the result is carried forward there is no real need for maintaining
the stack because there is no operation pending.

Note that as stack depth increases memory consumption of the process increases.
So stacks can't be ignore.
Languages like lisp, haskell etc do a bit of 'magic' called Tail Call Optimization
where when recursion is the last step of the procedure then they
reuse the current stack frame.

So (sum 1 5 0) would look like

(sum 1 5 0)
(sum 2 5 1)
(sum 3 5 3)
(sum 4 5 6)
(sum 5 5 10)
(sum 6 5 15)
15

All the operation would happen in the same stack frame, so memory consumption
is nearly constant.

Other languages like C++, Java etc make use of the 'for' , 'while' keywords and the
like to arrive at similar characteristics.
This type of process is called an iterative process where there is constant stack usage and all the state information is remembered through parameters alone.

[Note that if we had to do any operation with the result(15 in this case) of a function call at any point then that information can be maintained only in a stack]

No comments:

Post a Comment