Essence of Functional Programming for Imperative Programmers

by Shelby Moore III

C++, C#, Java, PHP, HaXe, etc. imperative programmers should be able to understand the Javascript examples herein.

Recursive Fibonacci (slow)

Compute an element of the natural numbers (positive integers) fibonacci sequence recursively (Javascript). C++ and others are here.

// Return 0-based element from sequence which is the sum of the previous two
fib( i ) {
	if (i < 2) return i;
	return fib( i-1 ) + fib( i-2 );
}

And in Haskell.

fib i = if (i < 2) then i else fib( i-1 ) + fib( i-2 )

Fix to guard for negative element indices.

fib i
    	| i  > 1 = fib( i-1 ) + fib( i-2 )
    	| i == 1 = 1
	| i == 0 = 0
	| i  < 0 = error "fib: negative index"

Iterative Fibonacci (faster)

To eliminate the duplicate recursion of the sequence from i-2..1 (and inability to accumulator parameter optimize for tail recursion), compute the sequence non-recursively in its forward direction (Javascript).

fib( i ) {
	for( t = 0, a = 0, b = 1; --i >= 0; ) {
		t = a;
		a = b;
		b += t;
	}
	return t;
}

Cache the computed list for the sequence of n elements (Javascript).

// Return list where each element is the sum of the previous two
fibonacci( n ) {
	for( a = 0, b = 1, list = new Array; --n >= 0; ) {
		list.push( t = a );
		a = b;
		b += t;
	}
	return list;
}

Lazy Evaluation in Non-Lazy Languages (fastest)

Eliminate duplicate computation of cached list when expanding the size of the sequence (Javascript).

fibonacci( n ) {
	if( !this.prototype.list ) this.prototype.list = [ 0, 1 ];	// static variable for list, [ 0, 1 ] = new Array( 0, 1 )
	list = this.prototype.list;					// reference, not copy
	if( n <= list.length ) return list.slice( 0, n );
	for( a = list[list.length-2], b = list[list.length-1], n -= list.length; --n >= 0; ) {
		list.push( t = a + b );
		a = b;
		b = t;
	}
	return list;
}

fib( i ) {
	return fibonacci( i + 1 )[i];
}

The above obscures, complicates, and conflates the fibonacci algorithm with code and algorithm for caching. Imagine the possible spaghetti for algorithms which are more complex. The solution is to express the sequence as the logical composition of lists, but for this it is impossible to re-factor Javascript to logically recurse fibs().

fibs( n ) {
	if (n == 0) return [ 0 ];
	if (n == 1) return [ 0, 1 ];				// [ 0, 1 ] = new Array( 0, 1 )
	fibs2 = tail( tail( fibonacci( n ) ) );
	return add( fibs2, tail( fibs2 ) ).unshift( 0, 1 );	// [ 0, 1 ].push() of each element of add(...)
}

// Return list with first element removed
tail( list ) {
	list.length && list.shift();
	return list;
}

// Return list that is element-wise addition of two lists
add( list1, list2 ) {
	for( list = []; list1.length && list2.length; )		// [] is new Array()
		list.push( list1.shift() + list2.shift() );
	return list;
}

Lazy Evaluation in Haskell (multi-threaded)

Haskell expresses it declaratively, with invisible caching (aka lazy evaluation) and theoretically forward iteration optimization of the recursion[1]. Visualize an arc with arrow tip from the leftmost fibs to the other two, then loop that in your mind incrementally forward starting from 0th element for each in the sequence.

fibs = 0 : 1 : zipWith (+) fibs() (tail( fibs() ))

Note that function calls, e.g. tail( fibs() ), should not have parenthesis except for grouping, e.g. (tail fibs), and never for multiple arguments (due to partial application and currying).

fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

Eliminate tail with as-pattern.

fibs @(1:tfibs) = 0 : 1 : zipWith (+) fibs tfibs

Make forward iteration explicit, and remove self referential space "leak".

fibs = iterate (\x -> [last x + last init x]) [ 0 : 1 ]

Or verbosely:

fibs = iterate (\x -> [fib -1 + fib -2] where fib i = | i==-1=last x | i==-2=last init x) [ 0 : 1 ]
	-- negative indices in local function fib offset from end of list

Or concisely:

fibs = 0 : scanl (+) 1 fibs

Alternatively using a short-hand syntax (aka syntactic sugar) for list comprehension, adds no benefits:

fibs = 0 : 1 : [ a + b | (a, b) <- zip fibs (tail fibs) ]

In all above, fibs is the fibonacci sequence (a list) formed iterably by the sum of itself and its tail. zip interleaves these recursive lists into a list of tuples, (a, b), because a <- (naturals n), b <- (tail (naturals n)) would instead produce a cross product. Alternatively, zipWith is a higher order function because it inputs a function, the (+) sum operator, which it applies directly to the recursive lists, eliminating the intermediate step of forming a tuple.

In a pure functional language with lazy evaluation, such as Haskell (when not using monads, e.g. do), we write is instead of returns, because fibs !! 0, is and only computes the first element of sequence-- the entire infinite sequence is never computed when fibs is composed in code. The list element index operator is, !!, and operators in parenthesis become functions, e.g. (!!) fibs i returns i-th element.

In the above example, the type of fibs is undefined and will be inferred from the use fibs, constrained by the set of types that the + operator accepts. A separate copy, of the evaluated portions of the sequence, will be cached for each inferred type. We can declare the type to be the list of (positve and negative) integers. Note it is impossible for the fibonacci algorithm to produce a negative integer, because the positive trending sequence in seeded with 0 and 1. Haskel does not have a type for natural numbers yet. Due to lazy evaluation, declaring fibs [Int] will only cause an error if evaluation computes an element of the sequence (list), with a value greater than 2 to the 32nd power. Or infinite-precision [Integer] may be used if performance is not critical.

fibs :: [Int]

Lazy Evaluation

As demonstrated for Javascript above, in non-lazy evaluation languages, extra code must be interleaved (i.e. co-mingled and conflated) into the core algorithm to pre-compute and cache the portions of the infinite list the algorithm can generate. Whereas, lazy evaluation means that at runtime only the portions of the list will be calculated, as they are needed. Lazy evaluation is allowed in pure functional languages, because the logic of a function is not dependent on execution-order, e.g. computation of fibs !! i for any i can occur at any time in the program. Lazy evalution requires referential transparency, a property of pure functional calculus, which means that functions are immutable, can not reference external values (only external functions), and every return (i.e. no closures on local variables) and argument of every function are passed by value, never by reference. Note pass-by-value does not impact performance, as pointers may be used by the compiler, because these values are always immutable.

Advantages

Pure functional enables some composability, cross-module, algorithm and compiler optimizations[1].

Allocation Size Determinism

Pure functional with lazy evaluation obscures virtual memory space allocation "leaks", because two functions may compile and behave algorithmically correct in all outcomes other than those derived from heap space allocation order and size.

repeat x = xs where xs = x : xs

{- List contains references to itself, thus lazy allocated size grows
   to maximum size shared by all references to this function. -}
repeat'leak x = x : repeat'leak x

leak'none x = take 1000000 repeat x

leak'million x = take 1000000 repeat'leak x

This enables orthogonality of core algorithm and cache space algorithm profiling + reasoning[5], can be selectively disabled[5], and is no worse than emulating lazy list cache resizing in non-lazy languages-- e.g. the fibonacci example in Javascript-- with better orthogonality and tools for the lazy default[6]. Space "leaks" are uncontrolled excessive caching that are re-usable in future runtime, thus unlike traditional memory leaks that can not be re-used (lost memory until program termination), impact performance relative to virtual memory paging versus re-computation speed. I have submitted an idea for runtime optimization to provide a more deterministic upper bound on paging load for cache control.

OOP

OOP and Base classes eliminated.

The following Haskell code:

data Maybe a = Nothing | Just a

showPet :: Maybe (String, String) -> String
showPet Nothing                 = "none"
showPet (Just (name, species))  = "a " ++ species ++ " named " ++ name

Is semantically equivalent to the following Copute code:

class Maybe {}
class Nothing( Maybe ) {}

showPet( _ : Nothing ) { return "none" }
showPet( ( Maybe )( name, species ) )
{
    return "a " + species + " named " + name
}

Monads

Categories are composable functions (morphisms) between types:

      ia :: a -> a                           -- identity functions
      ib :: b -> b
      ic :: c -> c

       f :: a -> b                           -- morphism functions
       g :: b -> c

     (.) :: (b->c) -> (a->b) -> (a->c)       -- functions composition, g.f = \x -> g (f x)

     g.f :: a -> c                           -- g (f x)
    f.ia :: a -> b
    ib.f :: a -> b

    ia x =  x
    ib x =  x
    ic x =  x
  f.ia x == f x                              -- right identity
  ib.f x == f x                              -- left identity
g.(f.ia) == (g.f).ia                         -- composition is associative

Monad maps to the category that has a parameterized type constructor:

                         unit :: t -> Type t                           -- identity parameterized constructor, aka 'return'

                           ft :: a -> Type b                           -- morphism functions
                           gt :: b -> Type c
                           ht :: a -> Type c

                        (>>=) :: Type b -> (b->Type c) -> Type c       -- parameterized composition, aka 'bind' or '*'
                                                                       -- (ft >>= gt) = \x -> (ft x) >>= gt

                         ht x =  \x -> (ft x) >>= gt                   -- ht x = (>>=) (ft x) gt

                          m x =  unit x
                         ft x == \x -> (m x) >>= ft                    -- right identity parameterized
                         ft x == \x -> (ft x) >>= m                    -- left identity parameterized
(\x -> ((m x) >>= ft) >>= gt) == \x -> (m x) >>= (\y -> (ft y) >>= gt) -- composition is associative

Right identity, left identity, and associativity of (>>=) means a Monad conforms to the pure functional calculus. Execution-order can be enforced using parenthesis for grouping, using $!, or the do notation, which always associates left-to-right m x >>= (\y -> (ft y) >>= gt).

Monads enable composability between functions that morph types to parameterized types-- necessary because gt.ft is a type mismatch-- without needing to write a new function Type a -> Type c for each permutation of otherwise mismatched composition (b -> Type c) . (a -> Type b). Monads are feasible in imperative languages, such as C# (examples of useful 'bind' functions). Restricted monads parameterize on types a and b.

Categories with inverses are an isomorphism, i.e. bijective.

    invf :  b -> a
f.invf x == ib x
invf.f x == ia x

State Monad

And thus we can transform any referentially transparent (i.e. no side effects) functional code into a mathematical equivalent that maintains the referential transparency, by propagating global state from the inner-most function branch tips to the main program trunk function:

   ft a =  unit f a
   gt b =  unit g b
g (f x) == ((unit x) >>= ft) >>= gt
g (f x) == (unit x) >>= ft >>= gt

If we define a non-Haskell bind operator <<= which is right-to-left associative:

  (<<=) :: (b->Type c) -> Type b -> Type c

g (f x) == gt <<= ft <<= (unit x)
g (f x) == (gt <<= ft) <<= (unit x)

Then to inject state[7] from the inner-most functions (branch tips) propagating down the tree to the program main trunk:

function unit( a )
{
    return function( X ) { return (a, X) }
}

function bind( k, m )
{
    return function( X )
    {
        (b, Y) = m( X );
        return (c, Z) = k( b )( Y );
    }
}

function ft( a ) { unit( f( a ) ) }
function gt( b ) { unit( g( b ) ) }

unit( g( f( x ) ) ) == bind( gt, bind( ft, unit( x ) ) )( initial_state )
unit( g( f( x ) ) ) == bind( bind( gt, ft ), unit( x ) ) )( initial_state )

In order to do anything useful, the example above can be modified to give f() and g() the environment of state x from the containing bind() function, and then optionally allow f() and g() to return (a, x) instead of a. Haskell accomplishes this with type inference, where f() and g() are inferred to have the type returned by unit() and thus unit() is called automatically by the compiler if f() or g() return a instead of (a, x). We can expand the above to:

unit( g( f( x ) ) ) ==
    function( X2 )                      // left-most bind
    {
        (b, Y) =
            function( X )               // right-most bind
            {
                return (f( x ), X)      // f() is allowed to modify X
            }( X2 )
        return (g( b ), Y)              // g() is allowed to modify Y
     }( initial_state )

unit( g( f( x ) ) ) ==
    function( X2 )                      // left-most bind
    {
        (b, Y) =
            function( X )               // right-most bind
            {
                return (g( f( x ) ), X) // g() and f() are allowed to modify X
            }( X2 )
        return (b, Y)
     }( initial_state )

Imperative language statements:

a = 3
b = foo()
k( f( a ), g( b ) )

Are equivalent to:

function c3() { return 3; }
function a() { return c3() }
function b() { return foo() }
function( x ) { return k( f( a ), x ) }( g( b ) )

And in Haskell lambda style (\c3 -> (\a -> (\b -> k f a g b) foo) c3) 3. Function composition employs lambda (aka currying) for functions that input more than one function, which is in Haskell k f a g b == k (f a) (g b) == (\x -> k (f a) x) (g b) or in imperative k( f( a ), g( b ) ) == function( x ) { return k( f( a ), x ) }( g( b ) ). Thus any program is equivalent to one composed of functions which take one function argument. Note that we can left-to-right associate any triple of functions in any order, e.g k f a can be (\x -> k.f x) a or k (f a) and similarly for the triple (k f a) g b. So we can compute g b before f a or vice versa, and this is what makes pure functional programming maximally granular for parallel execution. However, if in the imperative g( b ) modifies a, which is not allowed in Haskell, then order of execution of f( a ) matters. The order in which an imperative global variable is modified is forward statement execution order from the main program trunk down to each branch tip of nested functions, is inverted by the State Monad to a variable that is a returned (joined with the normal return value by unit) from each function starting from the Haskell right-most branch tip. Note in prior sentence replace 'variable' with 'computation' for the general monad. Actually the inversion is only a visual syntax, as we showed above that a = 3 occurs right-most in the equivalent Haskell syntax. As shown above that although the bind operator is associative, the State Monad eliminates the opportunities for out-of-order execution, but only for those functions which access the its state, because an attempt to compute such a function will force the cascade of computation of each return value in order from the right-most nested branch tip. Thus the stateful monads are a general model for fine grained interoperability between pure referentially transparent, lazy-order functional code and imperatively ordered stateful code. Note there are also non-stateful uses of the general monad interface[8].


[1]In theory, I reason that Haskell's compiler could automatically invert the nested recursion to forward iteration from 0th element to requested element of the list, i.e. computes fibs !! 0, fibs !! 1, fibs !! 2, etc., because the recursion direction is generally known to be reverse on a sequential (aka non-random) access linked-list for any side-effect free algorithm that composes exclusively from portions of itself. Note that lists do not have negative indices and side-effects can not occur in pure functional.

[2]Theoretically at a high level of abstract reasoning, Haskell's thunks need not do more operations[1] than the imperative example which reloads its suspended state when it resizes its fibonacci array (assuming single-threaded optimization comparison, else the imperative example needs more operations added to it to support multi-threading). In terms of speed for cached thunks, I reason that theoretically the test for the existence of a thunk, given a subexpression that is an element of a linked-list, can be abstractly equivalent to a single assembly conditional instruction test for magic value (e.g. null or pointer to update subroutine) on read of the next element pointer in the linked list-- an operation which non-lazy languages also require to detect termination of a finite list or loop. Note I am not refering to update-in-place.

[3]John Hughes (1984). "Why Functional Programming Matters" (3 §2, "An Analogy with Structured Programming")

[4]John Hughes (1984). "Why Functional Programming Matters" (8 §4, "Glueing Programs Together")

[5]Hudak, Hughes, Peyton Jones, Wadler (2007). "A History of Haskell: being lazy with class" (32 §10.3, "Controlling evaluation order")

[6]Hudak, Hughes, Peyton Jones, Wadler (2007). "A History of Haskell: being lazy with class" (32 §10.2, "Space profiling")

[7]Wadler (1995). "Monads for functional programming" (§2.8, "Variation two, revisited: State")

[8]Hughes (2005). "Programming with Arrows", in 5th International Summer School on Advanced Functional Programming, LNCS 3622, pp 73-129, Springer, 2005 (§1.3, "Arrows as computations")