Functional Programming

Topics:

Learning Objectives:

  1. Outline the strengths and weaknesses of the functional programming paradigm.
  2. Design, code, test, and debug programs using the functional paradigm.

    Sources (and further reading):

    1. Scott, chapter 10
    2. http://www.ocaml-tutorial.org/functional_programming
    3. Paul Graham's article: The Roots of Lisp

    What is functional programming?

    A functional program consists of an expression E (representing both the algorithm and the input). This expression E is subject to some rewrite rules. Reduction consists of replacing some part P of E by another expression P' according to the given rewrite rules. ... This process of reduction will be repeated until the resulting expression has no more parts that can be rewritten. The expression E* thus obtained is called the normal form of E and constitutes the output of the functional program. -- H. P. Barendregt

    Functional programming is characterized by programming with values, functions, and functional forms.

    Keywords and phrases: Lambda calculus, free and bound variables, scope, environment, functional programming, combinatory logic, recursive functions, functional, curried function.


    Functional programming languages are the result of both abstracting and generalizing the data type of maps. Recall that a mapping m from each element x of a set S (called the domain) to the corresponding element m(x) of a set T (called the range) is written as:

    m : S --> T

    For example, the squaring function is a function of type:

    sqr : Num --> Num

    and may be defined as:

    sqr where x |--> x*x

    A linear function f of type

    f : Num --> Num

    may be defined as:

    f where x |--> 3*x + 4
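
    For concreteness, these two maps might be written in OCaml (the language used in later examples); the definitions below simply reuse the names sqr and f from above:

    (* The squaring function: a map from numbers to numbers. *)
    let sqr x = x * x

    (* The linear function f(x) = 3x + 4. *)
    let f x = 3 * x + 4

    (* For example, sqr 5 evaluates to 25 and f 5 evaluates to 19. *)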

    Functional programming is based on the mathematical concept of a function and functional programming languages include the following:

    • A set of primitive functions.
    • A set of functional forms.
    • The application operation.
    • A set of data objects and associated functions.
    • A mechanism for binding a name to a function.

    LISP, FP, Scheme, ML, Miranda, and Haskell are just some of the languages that implement this elegant computational paradigm.
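
    As a rough sketch of these ingredients in OCaml (compose and succ_then_double are names made up for this example):

    (* Primitive functions and data objects: integers, lists, (+), List.map, ... *)

    (* A functional form: compose combines two existing functions into a new one. *)
    let compose f g = fun x -> f (g x)

    (* A mechanism for binding a name to a function. *)
    let succ_then_double = compose (fun x -> 2 * x) (fun x -> x + 1)

    (* The application operation: applying the function to a data object. *)
    let result = succ_then_double 10    (* 22 *)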

    The basic concepts of functional programming originated with LISP.

    Why Bother?

    Functional programming languages are important for the following reasons.
    • Functional programming dispenses with the assignment statement, freeing the programmer from the rigidly sequential mode of thought that assignment requires.
    • Functional programming encourages thinking at higher levels of abstraction by providing higher-order functions -- functions that modify and combine existing programs.
    • Functional programming lends itself naturally to concurrent implementation.
    • Functional programming has important application areas. Much artificial intelligence programming is done in functional languages, and those AI techniques migrate into real-world applications.
    • Functional programming is useful for developing executable specifications and prototype implementations.
    • Functional programming has a close relationship to computer science theory. Functional programming is based on the lambda-calculus which in turn provides a framework for studying decidability questions of programming. The essence of denotational semantics is the translation of conventional programs into equivalent functional programs.

    Reduction Order

    Given a lambda-expression, the substitution and beta-reduction rules provide the tools required to reduce a lambda-expression to normal form, but they do not tell us in what order to apply the reductions when more than one redex is available. The following theorem, due to Curry, states that if an expression has a normal form, then that normal form can be found by leftmost reduction.

    Theorem: If E has a normal form N then there is a leftmost reduction of E to N.

    The leftmost outermost reduction (normal order reduction) strategy is called lazy reduction because it does not first evaluate the arguments but substitutes the arguments directly into the expression. Eager reduction is when the arguments are reduced before substitution.

    A function is strict if it is sure to need its argument. If a function is non-strict, we say that it is lazy.

    LISP and Scheme evaluate their arguments before passing them (which eliminates the need for renaming), a space- and time-efficiency consideration.
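
    OCaml, like LISP and Scheme, is eager by default, but non-strict behaviour can be simulated with explicit suspensions. A small sketch (const_zero and loop are made-up names):

    (* A computation with no normal form. *)
    let rec loop () : int = loop ()

    (* A non-strict function: it never needs its argument. *)
    let const_zero _ = 0

    (* Eager (applicative-order) evaluation reduces the argument first, so
       const_zero (loop ()) never terminates. With an explicit suspension the
       argument is never forced, so this returns 0, mirroring normal-order
       (lazy) reduction. *)
    let ok = const_zero (lazy (loop ()))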

    Lisp

    (This section is adapted from Paul Graham's excellent article The Roots of Lisp.)

    In 1960, John McCarthy published a remarkable paper: "Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I", Communications of the ACM 3:4, April 1960, pp. 184-195.

    In this paper, he did for programming something like what Euclid did for geometry. He showed how, given a handful of simple operators and a notation for functions, you can build a whole programming language. He called this language Lisp, for "List Processing," because one of his key ideas was to use a simple data structure called a list for both code and data.

    It's worth understanding what McCarthy discovered, not just as a landmark in the history of computers, but as a model for what programming is tending to become in our own time. It seems to me that there have been two really clean, consistent models of programming so far: the C model and the Lisp model. These two seem points of high ground, with swampy lowlands between them. As computers have grown more powerful, the new languages being developed have been moving steadily toward the Lisp model. A popular recipe for new programming languages in the past 20 years has been to take the C model of computing and add to it, piecemeal, parts taken from the Lisp model, like runtime typing and garbage collection.

    In this representation, a function is expressed as (lambda (p1 ... pn) e), where each "pi" is an atom (called a parameter) and "e" is an expression. An expression whose first element is such an expression

    ((lambda (p1 ... pn) e) a1... an)
    
    is called a function call and its value is computed as follows. Each expression "ai" is evaluated. Then "e" is evaluated. During the evaluation of "e", the value of any occurrence of one of the "pi" is the value of the corresponding "ai" in the most recent function call. For example:
    > ((lambda (x) (cons x '(b))) 'a) 
    (a b) 
    
    > ((lambda (x y) (cons x (cdr y)))
        'z
         '(a b c))
    (z b c) 
    

    There is another notation for functions that enables the function to refer to itself, thereby giving us a convenient way to define recursive functions. The notation

    (label f (lambda (p1 .. pn) e))
    

    denotes a function that behaves like (lambda (p1 .. pn) e), with the additional property that an occurrence of "f" within e will evaluate to the label expression as if f were a parameter of the function.

    For example, suppose we want to define a function (subst x y z) which takes an expression "x", an atom "y", and a list "z", and returns a list like "z" but with each instance of "y" at any depth of nesting in "z" replaced by "x".

    > (subst 'm 'b '(a b (a b c) d))
    (a m (a m c) d)
    

    We can denote this function as

    (label subst (lambda (x y z)
    				(cond ((atom z)
    				       (cond ((eq z y) x)
    			                 ('t z)))
                          ('t (cons (subst x y (car z))
                                    (subst x y (cdr z)))))))
    
    We will abbreviate f = (label f (lambda (p1 ... pn) e)) as
    (defun f (p1 ... pn) e)
    
    so
    (defun subst (x y z)
       (cond ((atom z)
              (cond ((eq z y) x)
                    ('t z)))
             ('t (cons (subst x y (car z))
                        (subst x y (cdr z))))))
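
    For comparison, here is a rough OCaml analogue of subst; the sexp type is introduced just for this sketch:

    (* A small S-expression type. *)
    type sexp = Atom of string | List of sexp list

    (* Replace every occurrence of the atom y in z, at any depth, by x. *)
    let rec subst x y z =
      match z with
      | Atom _ -> if z = y then x else z
      | List items -> List (List.map (subst x y) items)

    (* subst (Atom "m") (Atom "b")
             (List [Atom "a"; Atom "b"; List [Atom "a"; Atom "b"; Atom "c"]; Atom "d"])
       evaluates to
       List [Atom "a"; Atom "m"; List [Atom "a"; Atom "m"; Atom "c"]; Atom "d"] *)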
    

    Here's the original 1960 LISP interpreter, written in LISP.

    ; The Lisp defined in McCarthy's 1960 paper, translated into CL.
    ; Assumes only quote, atom, eq, cons, car, cdr, cond.
    ; Bug reports to lispcode@paulgraham.com.
    
    (defun null. (x)
      (eq x '()))
    
    (defun and. (x y)
      (cond (x (cond (y 't) ('t '())))
            ('t '())))
    
    (defun not. (x)
      (cond (x '())
            ('t 't)))
    
    (defun append. (x y)
      (cond ((null. x) y)
            ('t (cons (car x) (append. (cdr x) y)))))
    
    (defun list. (x y)
      (cons x (cons y '())))
    

    These next couple are going to take a little explaining.

    (pair. x y) takes two lists of the same length and returns a list of two-element lists containing successive pairs of an element from each.

    (defun pair. (x y)
      (cond ((and. (null. x) (null. y)) '())
            ((and. (not. (atom x)) (not. (atom y)))
             (cons (list. (car x) (car y))
                   (pair. (cdr x) (cdr y))))))
    

    (assoc. x y) takes an atom x and a list y of the form created by pair., and returns the second element of the first list in y whose first element is x.

    (defun assoc. (x y)
      (cond ((eq (caar y) x) (cadar y))
            ('t (assoc. x (cdr y)))))
    

    This next one is needed to evaluate cond expressions. It works its way through the clauses recursively, looking for the one in which the first element returns "t" (then it returns the value of the second element).

    (defun evcon. (c a)
      "eval (cond ((a b) (c d) ...))"
      (cond ((eval. (caar c) a)
             (eval. (cadar c) a))
            ('t (evcon. (cdr c) a))))
    

    There is a lot of nested list accessing in the above. In summary:

    (caar x)        = (car (car x))                  
    (cdar x)        = (cdr (car x))                   
    (cadar x)       = (car (cdr (car x)))           
    (caddr x)       = (car (cdr (cdr x)))          
    (caddar x)      = (car (cdr (cdr (car x))))   
    

    So much for the primitives. Here's the main engine. It takes two arguments: "e" is the expression to be evaluated, and "a" is the current environment, a list of (name value) pairs of the form built by the pair. function. All the variables and all the pre-defined functions are stored in that second argument.

    (defun eval. (e a)
      (cond
        ((atom e) (assoc. e a))
        ((atom (car e))
         (cond
           ((eq (car e) 'quote) (cadr e))
           ((eq (car e) 'atom)  (atom   (eval. (cadr e) a)))
           ((eq (car e) 'eq)    (eq     (eval. (cadr e) a)
                                        (eval. (caddr e) a)))
           ((eq (car e) 'car)   (car    (eval. (cadr e) a)))
           ((eq (car e) 'cdr)   (cdr    (eval. (cadr e) a)))
           ((eq (car e) 'cons)  (cons   (eval. (cadr e) a)
                                        (eval. (caddr e) a)))
           ((eq (car e) 'cond)  (evcon. (cdr e) a))
           ('t (eval. (cons (assoc. (car e) a)
                            (cdr e))
                      a))))
        ((eq (caar e) 'label)
         (eval. (cons (caddar e) (cdr e))
                (cons (list. (cadar e) (car e)) a)))
        ((eq (caar e) 'lambda)
         (eval. (caddar e)
                (append. (pair. (cadar e) (evlis. (cdr e) a))
                         a)))))
    
    (defun evlis. (m a)
      (cond ((null. m) '())
            ('t (cons (eval.  (car m) a)
                      (evlis. (cdr m) a)))))
    

    The most complex part of the above is the lambda call. So

    (eval.  '(f '(b c))
            '((f (lambda (x) (cons 'a x)))))
    
    turns into
    (eval. '((lambda (x) (cons 'a x)) '(b c))
            '((f (lambda (x) (cons 'a x)))))
    

    which returns (a b c).

    Now that we understand how eval works, let's step back and consider what it means. What we have here is a remarkably elegant model of computation. Using just quote, atom, eq, car, cdr, cons, and cond, we can define a function, eval., that actually implements our language, and then using that we can define any additional function we want.

    There were already models of computation, of course -- most notably the Turing machine. But Turing machine programs are not very edifying to read. If you want a language for describing algorithms, you might want something more abstract, and that was one of McCarthy's aims in defining Lisp.

    The language he defined in 1960 was missing a lot. It had no side-effects, no sequential execution (which is useful only with side-effects anyway), no practical numbers, and dynamic scope. But these limitations can be remedied with surprisingly little additional code; Steele and Sussman showed how to do it in a famous series of papers.

    If you understand McCarthy's eval, you understand more than just a stage in the history of languages. These ideas are still the semantic core of Lisp today. So studying McCarthy's original paper shows us, in a sense, what Lisp really is. It's not something that McCarthy designed so much as something he discovered. It's not intrinsically a language for AI or for rapid prototyping, or any other task at that level. It's what you get (or one thing you get) when you try to axiomatize computation.

    Over time, the median language, meaning the language used by the median programmer, has grown consistently closer to Lisp. So by understanding eval you're understanding what will probably be the main model of computation well into the future.

    But we don't need to look too closely at the magic of all this. The main point here is that functional languages like LISP are a natural tool for defining simple, yet incredibly powerful, recursive explorers of nested structures.

    Scheme

    Scheme, a descendant of LISP, is based on the lambda-calculus. Although it has imperative features, in this section we ignore those features and concentrate on the lambda-calculus-like features of Scheme. Scheme has two kinds of objects, atoms and lists. Atoms are represented by strings of non-blank characters. A list is represented by a sequence of atoms or lists, separated by blanks and enclosed in parentheses. Functions in Scheme are also represented by lists. This facilitates the creation of functions which create other functions. A function can be created by another function and then applied to a list of arguments. This is an important feature of languages for AI applications.

    Syntax

    The syntax of Scheme is similar to that of the lambda calculus.
    Scheme Syntax
    E in Expressions
    A in Atoms ( variables and constants )
    ...
    E ::= A | (E...) | (lambda (A...) E) | ...
    Expressions are atoms which are variables or constants, lists of arbitrary length (which are also function applications), lambda-abstractions of one or more parameters, and other built-in functions.

    Scheme provides a number of built-in functions, among which are +, -, *, /, <, <=, =, >=, >, and not. Scheme provides for conditional expressions of the form (if E0 E1 E2) and (if E0 E1). Among the constants provided in Scheme are numbers, #f and the empty list () (both of which count as false), and #t (anything other than #f and () counts as true). nil is also used to represent the empty list.

    Definitions

    Scheme implements definitions with the following syntax
    E ::= ...| (define I E) | ...

    Lists with nil, cons, car and cdr

    The list is the basic data structure, with nil representing the empty list. Among the built-in functions for list manipulation provided in Scheme are cons for attaching an element to the head of a list, car for extracting the first element of a list, and cdr which returns a list minus its first element. The following code contains an example of stack operations written in Scheme. The code illustrates definitions, the conditional expression, the list predicate null? for testing whether a list is empty, and the list manipulation functions cons, car, and cdr.
    (define empty_stack
       (lambda (stack) (if (null? stack) #t #f)))
    
    (define push
       (lambda (element stack) (cons element stack)))
    
    (define pop
       (lambda (stack) (cdr stack)))
    
    (define top
       (lambda (stack) (car stack)))
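
    Looking ahead to the next section, the same stack operations might be written in OCaml over its built-in lists like this (a sketch; the behaviour on an empty stack is inherited from List.hd and List.tl):

    (* The same stack operations over OCaml's built-in lists. *)
    let empty_stack stack = (stack = [])

    let push element stack = element :: stack

    let pop stack = List.tl stack    (* fails on an empty stack *)

    let top stack = List.hd stack    (* fails on an empty stack *)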
    

    OCaml

    LISP and Scheme are "old school". There is much more modern interest in F#, Haskell, and OCaml. Here's a OCaml example example:

    # let double x =
        x * 2
      in
      List.map double [ 1; 2; 3 ];;
    - : int list = [2; 4; 6]
    

    The function double takes an argument x and returns x * 2. Then map calls double on each element of the given list ([1; 2; 3]) to produce the result: a list with each number doubled.

    map is known as a higher-order function (HOF). "Higher-order function" is just a fancy way of saying a function that takes another function as one of its arguments. The same thing in LISP would be:

    (defun doubler ()
    	#'(lambda (x) (* x 2)))
    
    (mapcar (doubler) '(1 2 3))
    
    ==>
    (2 4 6)
    

    If you're familiar with C/C++ then this looks like passing a function pointer around. Java has some sort of abomination called an anonymous class which is like a dumbed-down, slow and long-winded closure. If you know Perl then you probably already know and use Perl's closures and Perl's map function, which is exactly what we're talking about. The fact is that Perl is actually quite a good functional language.

    Closures are functions which carry around some of the "environment" in which they were defined. In particular, a closure can reference variables which were available at the point of its definition. Let's generalise the function above so that now we can take any list of integers and multiply each element by an arbitrary value n:

    let multiply n list =
      let f x =
        n * x
      in
      List.map f list
      ;;
    

    Hence:

    # multiply 2 [1; 2; 3];;
    - : int list = [2; 4; 6]
    # multiply 5 [1; 2; 3];;
    - : int list = [5; 10; 15]
    

    The important point to note about the multiply function is the nested function f. This is a closure. Look at how f uses the value of n which isn't actually passed as an explicit argument to f. Instead f picks it up from its environment - it's an argument to the multiply function and hence available within this function.

    This might sound straightforward, but let's look a bit closer at that call to map: List.map f list.

    map is defined in the List module, far away from the current code. In other words, we're passing f into a module defined "a long time ago, in a galaxy far far away". For all we know that code might pass f to other modules, or save a reference to f somewhere and call it later. Whether it does this or not, the closure will ensure that f always has access back to its parental environment, and to n.

    In LISP, the same example would be:

    (defun multn (n) 
    	#'(lambda (x) (* x n)))
    
    (mapcar (multn 2) '(1 2 3 4))
    ==>
    (2 4 6 8)
    
    (mapcar (multn 5) '(1 2 3 4))
    ==>
    (5 10 15 20)
    

    As with the OCaml solution, the argument to multn is captured in a closure and carried along as the list is traversed.

    A larger LISP example of the same technique, which collects the distinct left-hand sides of a grammar, is:

    (defparameter *grammar*
      '((Sentence -> Noun VerbPhrase )
        (VerbPhrase -> runs quickly)
        (VerbPhrase -> walks slowly)))
    
    (defun lhs (g)
      (let (all)
        (mapc #'(lambda (one)
                     (setf all (lhs1 (first one) all)))
              g)
        all))
    
    (defun lhs1 (item all)
      (if (member item all)
          all
          (cons item all)))
    
    ;e.g.
    ;(lhs *grammar*)
    

    Partial function applications and currying

    Let's define a plus function which just adds two integers:

    let plus a b =
      a + b
      ;;
    

    Some questions for people asleep at the back of the class.

    1. What is plus?
    2. What is plus 2 3?
    3. What is plus 2?

    Question 1 is easy. plus is a function: it takes two arguments, which are integers, and it returns an integer. We write its type like this:

    plus : int -> int -> int
    

    Question 2 is even easier. plus 2 3 is a number, the integer 5. We write its value and type like this:

    5 : int

    But what about question 3? It looks like plus 2 is a mistake, a bug. In fact, however, it isn't. If we type this into the OCaml toplevel, it tells us:

    # plus 2;;
    - : int -> int = <fun>
    

    This isn't an error. It's telling us that plus 2 is in fact a function, which takes an int and returns an int. What sort of function is this? We experiment by first of all giving this mysterious function a name (f), and then trying it out on a few integers to see what it does:

    # let f = plus 2;;
    val f : int -> int = <fun>
    # f 10;;
    - : int = 12
    # f 15;;
    - : int = 17
    # f 99;;
    - : int = 101
    

    In engineering this is sufficient proof by example for us to state that plus 2 is the function which adds 2 to things.

    Going back to the original definition, let's "fill in" the first argument (a) setting it to 2 to get:

    let plus 2 b =       (* This is not real OCaml code! *)
      2 + b
      ;;
    

    You can kind of see, I hope, why plus 2 is the function which adds 2 to things.

    Looking at the types of these expressions we may be able to see some rationale for the strange -> arrow notation used for function types:

      plus : int -> int -> int
      plus 2 : int -> int
    plus 2 3 : int
    

    This process is called currying (more precisely, plus is a curried function, and applying it to only some of its arguments is partial application). It is named after Haskell Curry, who did important work on the lambda calculus and combinatory logic. Since I'm trying to avoid entering into the mathematics behind OCaml, because that makes for a very boring and irrelevant tutorial, I won't go any further on the subject. You can find much more information about currying if it interests you by doing a search on Google.
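
    In OCaml the connection is explicit: a two-argument function is just a function that returns a function, so the following definitions are equivalent (plus' is a name made up for this sketch):

    (* The usual definition. *)
    let plus a b = a + b

    (* The same function written one argument at a time. *)
    let plus' = fun a -> fun b -> a + b

    (* Partial application simply stops after the first argument. *)
    let add2 = plus 2      (* add2 : int -> int *)
    let five = add2 3      (* 5 *)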

    Remember our double and multiply functions from earlier on? multiply was defined as this:

    let multiply n list =
      let f x =
        n * x
      in
      List.map f list
      ;;
    

    We can now define double, triple &c functions very easily just like this:

    let double = multiply 2;;
    let triple = multiply 3;;
    

    They really are functions, look:

    # double [1; 2; 3];;
    - : int list = [2; 4; 6]
    # triple [1; 2; 3];;
    - : int list = [3; 6; 9]
    

    You can also use partial application directly (without the intermediate f function) like this:

    # let multiply n = List.map (( * ) n);;
    val multiply : int -> int list -> int list = <fun>
    # let double = multiply 2;;
    val double : int list -> int list = <fun>
    # let triple = multiply 3;;
    val triple : int list -> int list = <fun>
    # double [1; 2; 3];;
    - : int list = [2; 4; 6]
    # triple [1; 2; 3];;
    - : int list = [3; 6; 9]
    

    In the example above, (( * ) n) is the partial application of the ( * ) (times) function. Note the extra spaces, which are needed so that OCaml doesn't think (* starts a comment.

    Note that LISP can be stretched to handle currying, but there is something fundamentally more elegant about OCaml (no annoying brackets!). Also, there is more current work on optimizing modern functional languages than old horses like LISP; see, for example, the benchmark results from The Great Language Shootout.


    What is functional programming good for?

    Functional programming, like any good programming technique, is a useful tool in your armoury for solving some classes of problems. It's very good for callbacks, which have multiple uses from GUIs through to event-driven loops. It's great for expressing generic algorithms. List.map is really a generic algorithm for applying functions over any type of list. Similarly you can define generic functions over trees. Certain types of numerical problems can be solved more quickly with functional programming (for example, numerically calculating the derivative of a mathematical function).
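
    As a small illustration of that last point, a numerical derivative is itself naturally a higher-order function. A sketch in OCaml (derivative is a name made up here, and the step size is arbitrary):

    (* Numerical differentiation: takes f : float -> float and returns an
       approximation of its derivative, using a central difference. *)
    let derivative f =
      let h = 1e-6 in
      fun x -> (f (x +. h) -. f (x -. h)) /. (2.0 *. h)

    (* Example: the derivative of x^2 at 3.0 is approximately 6.0. *)
    let f' = derivative (fun x -> x *. x)
    let slope_at_3 = f' 3.0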

    Pure and impure functional programming

    A pure function is one without any side-effects. A side-effect really means that the function keeps some sort of hidden state inside it. strlen is a good example of a pure function in C. If you call strlen with the same string, it always returns the same length. The output of strlen (the length) only depends on the inputs (the string) and nothing else. Many functions in C are, unfortunately, impure. For example, malloc - if you call it with the same number, it certainly won't return the same pointer to you. malloc, of course, relies on a huge amount of hidden internal state (objects allocated on the heap, the allocation method in use, grabbing pages from the operating system, etc.).
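
    The same distinction can be seen directly in OCaml (next_id and its counter are a made-up example of hidden state):

    (* Pure: the result depends only on the argument. *)
    let string_length s = String.length s

    (* Impure: the result also depends on (and updates) hidden state, so calling
       it twice with the same argument gives different results. *)
    let counter = ref 0
    let next_id () =
      incr counter;
      !counter

    (* string_length "abc" is always 3; next_id () is 1, then 2, then 3, ... *)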

    ML-derived languages like OCaml are "mostly pure". They allow side-effects through things like references and arrays, but by and large most of the code you'll write will be pure functional because they encourage this thinking. Haskell, another functional language, is pure functional. OCaml is therefore more practical because writing impure functions is sometimes useful.

    There are various theoretical advantages of having pure functions. One advantage is that if a function is pure, then if it is called several times with the same arguments, the compiler only needs to actually call the function once. A good example in C is:

    for (i = 0; i < strlen (s); ++i)
      {
        // Do something which doesn't affect s.
      }
    

    If naively compiled, this loop is O(n²) because strlen (s) is called each time and strlen needs to iterate over the whole of s. If the compiler is smart enough to work out that strlen is pure functional and that s is not updated in the loop, then it can remove the redundant extra calls to strlen and make the loop O(n). Do compilers really do this? In the case of strlen yes, in other cases, probably not.