Introduction to Prolog

Prolog is the Fortran of logic programming: it was the first of its kind, it was much criticised, and it is still widely used.

George Santayana: "Those who do not remember the past are condemned to repeat it."

Tim Menzies: "Those who do not study Prolog are condemned to repeat it."

The Standard "Family" Database Example

(Adapted from Robert F. Rossa: rossa@csm.astate.edu).

Prolog programs as databases

We show below a Prolog program consisting entirely of simple clauses or facts. This set of facts is a database about a family. Note that all the gender facts are grouped together, then all the father facts, then all the mother facts. Many versions of Prolog demand that similar clauses be grouped together.

/* family.pl - a simple database for a family */
gender(al,male).
gender(anne,female).
gender(arthur,male).
gender(amber,female).
gender(boris,male).
gender(barbara,female).
gender(betty,female).
gender(buck,male).
gender(carla,female).
gender(charles,male).
gender(cora,female).
gender(cuthbert,male).
gender(cecilia,female).
gender(calvin,male).

father(arthur,buck).
father(al,buck).
father(amber,buck).
father(anne,boris).
father(barbara,charles).
father(betty,cuthbert).
father(boris,charles).
father(buck,calvin).

mother(al,barbara).
mother(anne,betty).
mother(arthur,barbara).
mother(amber,barbara).
mother(boris,carla).
mother(barbara,carla).
mother(betty,cora).
mother(buck,cecilia).

We can load this database into a Prolog interpreter. Below we show how to do this using BinProlog 4.00, which is available on our UNIX system scott now and should be available on other systems later. In fact, you can get Windows95 and DOS versions from scott. We named our Prolog database "family.pl".

scott.astate.edu> bp

BinProlog 4.00 Copyright (C) Paul Tarau 1992-1995
by FTP from:   clement.info.umoncton.ca
E-MAIL to  : binprolog@info.umoncton.ca
(C-ified standalone)
(with heap GC enabled)
Finished loading system C-code (23203 instructions).
Finished loading user C-code (4 instructions).
?- [family].
compiling(to(mem),family.pl,...)
bytes_used(code(2208),strings(122),symbols(152),htable(1536),total(4018))
compile_time(183)
?-

The code to load family.pl was just [family]. Now the interpreter is waiting for us to query it. Every query must end with a period. When we query the interpreter, it will give us the first answer it finds, based on the facts in the database. If we want more answers, we enter a semicolon. If we don't want any more answers to that query, we just enter a carriage return. Thus consider the sequence that follows.

?- father(X,Y).
X=arthur,
Y=buck;

X=al,
Y=buck;

X=amber,
Y=buck;

X=anne,
Y=boris;

X=barbara,
Y=charles;

X=betty,
Y=cuthbert;

X=boris,
Y=charles;

X=buck,
Y=calvin

no
?-

The "no" simply indicates that there were no more answers. Prolog will answer "yes" or "no" depending on the truth or falsehood of a simple query. Thus:

?- mother(al,betty).
no
?- mother(al,barbara).
yes
?- 
We can build more complex queries like the following, to find a grandmother of al.
?- mother(al,X),mother(X,Y).
X=barbara,
Y=carla

yes
?-
By now you have observed that identifiers that start with an upper case letter are variables, and those that start with lower case letters are constants.

Rules

We can add some intelligence to our facts by adding a clause that tells how to figure out from the simple clauses whether Y is a parent of X. The rules we add are
parent(X,Y) :- father(X,Y).
parent(X,Y) :- mother(X,Y).
The symbol ':-' should be read as "if". Now we can query as follows.
?- parent(al,Y).
Y=buck;

Y=barbara

yes
?-
We can now build a rule that uses the parent rules.
grandfather(X,Y) :- parent(X,Z),father(Z,Y).
and ask questions like
?- grandfather(anne,Y).
Y=charles;

Y=cuthbert;

no
?- 

The scope of variables is one clause; there is no such thing as a global variable.

Backtracking

When Prolog tries to answer a query, it goes through a matching process. It tries to match the predicates on the right-hand side of the rule from left to right, binding variables when it finds a match. If a match fails, it may be because of the bondings in matches to the left. In that case, the interpreter backtracks - it returns to the nearest previous binding and tries to find a different match. If that fails, it backtracks some more.

We can trace what is going on. In the next sequence, we are going to consult our file family.pl instead of compiling it. This will allow the interpreter to see the text of the original rules.

?- consult(family).
% using compile/1 is MUCH faster
consulting(family.pl)
consulted(family.pl,time(50))
yes
?-  
Now we can use the builtin trace predicate to see what happens as one of our rules is matched.
?- trace(grandfather(al,Daddio)).
Call: grandfather(al,_2409) : -
 !!! dynamic(grandfather/2)
 Call: parent(al,_2959) : -
  !!! dynamic(parent/2)
  Call: father(al,_2959) : -
   !!! dynamic(father/2)
  Exit: father(al,buck)
 Exit: parent(al,buck)
 Call: father(buck,_2409) : -
  !!! dynamic(father/2)
 Exit: father(buck,calvin)
Exit: grandfather(al,calvin)
Daddio=calvin;

Redo: grandfather(al,calvin)
 Redo: father(buck,calvin)
 Fail: father(buck,_2409)
 Redo: parent(al,buck)
  Redo: father(al,buck)
  Fail: father(al,_2959)
  Call: mother(al,_2959) :-
   !!! dynamic(mother/2)
  Exit: mother(al,barbara)
 Exit: parent(al,barbara)
 Call: father(barbara,_2409) : -
  !!! dynamic(father/2)
 Exit: father(barbara,charles)
Exit: grandfather(al,charles)
Daddio=charles;

Redo: grandfather(al,charles)
 Redo: father(barbara,charles)
 Fail: father(barbara,_2409)
 Redo: parent(al,barbara)
  Redo: mother(al,barbara)
  Fail: mother(al,_2959)
 Fail: parent(al,_2959)
Fail: grandfather(al,_2409)
no
?-     

Notice in the last sequence that when the interpreter runs out of facts of the form father(barbara,_), it tries to find facts of the form mother(barbara,_) to satisfy the parent predicate. Note that rules in the database are tried in the order they appear in the database.

We can make Prolog print out all the answers to a question by deliberately making it fail, using the fail predicate.

?- grandfather(X,Y),write('X='),write(X),write(', Y='),write(Y),nl,fail.
X=arthur, Y=calvin
X=al, Y=calvin
X=amber, Y=calvin
X=anne, Y=charles
X=al, Y=charles
X=anne, Y=cuthbert
X=arthur, Y=charles
X=amber, Y=charles
no
?- 

Lists and Partial Structures

Prolog lists have heads and tails (think "car" and "cdr").

member(A, [A|_]).
member(A, [_|B]) :-
        member(A, B).

So, of course,...

?- member(a,[c,b,a]).
true

But Prolog can bactrack to find all repeated members of a list:

?-    member(a,[c,a,b,a]).
true ;
true 

Also, we can find all members of a list:

?- member(X,[a,b,c]).
X = a;
X = b;
X = c

Also, Prolog is a relational language. Variables are not inputs and outputs but concepts woven together by constraints. Also, we only need to partially specify the I/O to query those constraints:

?-  member(name=What, [age=20,shoesize=12,name=tim]).
What = tim .

(If this is causing stress, just chant to yourself: in Prolog, we don't code; rather, we draw the shape of the solution.)

This can get really funky. Here's the standard definition of how to append two lists:

append([], A, A).
append([A|B], C, [A|D]) :-
        append(B, C, D).

Which works in the expected way:

?- append([a,b,c],[d,e,f],Out).
Out = [a, b, c, d, e, f].

But any of these arguments can be left unspecified to achieve neat results:

; find the third thing in a list
?- append([ _, _, Third], _, [alice,john,mike,tim,veronica,wendy]).
Third = mike

; find the list before "tim"
?- append(Before, [tim | _], [alice,john,mike,tim,veronica,wendy]).
Before = [alice, john, mike]

; find the list after "tim"
?- append(_, [tim | After], [alice,john,mike,tim,veronica,wendy]).
After = [veronica, wendy]

You can even write "member" using "append":

member1(X,L) :- append(_,[X|_],L).

?- member1(a,[b,c,a]).
true .

?- member1(X,[b,c,a]).
X = b ;
X = c ;
X = a ;
fail.

Now with great power comes great responsbility. If you ask find all lists which can contain 'a', you can get an infinite set (of course).

?- member(a,L).
L = [ a |_G246] ;
L = [_G245,  a |_G249] ;
L = [_G245, _G248,  a |_G252] ;
L = [_G245, _G248, _G251,  a |_G255] ;
L = [_G245, _G248, _G251, _G254,  a |_G258] ;
L = [_G245, _G248, _G251, _G254, _G257,  a |_G261] ;
L = [_G245, _G248, _G251, _G254, _G257, _G260,  a |_G264] ;
L = [_G245, _G248, _G251, _G254, _G257, _G260, _G263,  a |_G267] ;

Lesson: logic, like anything else, must be used wisely.

Monkeys and Bananas

(Written by yours truly.)

Prolog variables support generalizations. For example, the following monkeys and bananas solution supports climbing onto a box anywhere in the room as well as push anything to anywhere.

It uses the following database:

% making Move in State1 results in State2;
% move( State1, Move, State2)

% representing the current state
% state( MonkeyHorizontal, MonkeyVertical, BoxPosition, HasBanana)

Our goal is to get the banana.

goal(state(_,_,_,has)).

Our start state...

start(state(atfloor,onfloor,atwindow,hasnot)).

This is how we can get from state1 to state2:

canget( S,[]) :- goal(S).             % base case
canget( State1,[Act|Rest])  :-        % general case
  move( State1, Act, State2),         % Do something
  canget( State2,Rest).               % Repeat

The knowledge base:

move( state( middle, onbox, middle, hasnot),   % Before move
      grasp,                                   % Grasp banana
      state( middle, onbox, middle, has) ).    % After move
move( state( P, onfloor, P, H),
      climb,                                   % Climb box
      state( P, onbox, P, H) ). 
move( state( P1, onfloor, P1, H),
      push( P1, P2),                           % Push box from P1 to P2
      state( P2, onfloor, P2, H) ).
move( state( P1, onfloor, B, H),
      walk( P1, P2),                           % Walk from P1 to P2
      state( P2, onfloor, B, H) ).

Main program:

main :- start(S0), canget(S0,  Steps),
        forall(member(Step,Steps), (print(Step),nl)).

Output:

?- main.
walk(atfloor, atwindow)
push(atwindow, middle)
climb
grasp
true ;

We pause here (even though we asked PROLOG to backtrack and find more solutions with the ";" command) to comment that we we got to the bananas.

Moving on, as we backtrack, we see that the solution gets longer. This is not a bug. What is going on?

walk(atfloor, atwindow)
push(atwindow, _G251)
push(_G251, middle)
climb
grasp
true ; % new solution found

walk(atfloor, atwindow)
push(atwindow, _G251)
push(_G251, _G262)
push(_G262, middle)
climb
grasp
true ; % another solution found

walk(atfloor, atwindow)
push(atwindow, _G251)
push(_G251, _G262)
push(_G262, _G273)
push(_G273, middle)
climb
grasp
true . % yet another solution found

Hmm.... looks like we need to control how far the money walks.

Search control

(Adapted from John Fisher's excellent Prolog tutorial.)

Prolog's hardwired search is depth-first through its list of clauses. But this can easily be changed. For example, in this section we will code up DFID in Prolog.

Simple DFID

For our first DFID, we control the size of the list that shows the steps taken to reach the goal. We can build that leash using the following trick:

?-  between(1,5,N), length(List, N).
N = 1,
List = [_G292] ;
N = 2,
List = [_G292, _G295] ;
N = 3,
List = [_G292, _G295, _G298] ;
N = 4,
List = [_G292, _G295, _G298, _G301] ;
N = 5,
List = [_G292, _G295, _G298, _G301, _G304].

We will use this trick to define dfid as a loop around the monkey's "canget" predicate:

dfid(Max) :- 
	start(S0),
	between(1,Max,Depth), % on backtracking, Depth=1,2,3,...
	length(Steps,Depth),  % Steps is a list of size Depth
	print(leash(Steps)),nl,
	canget(S0,Steps),
	print(Depth),nl,
    forall(member(Step,Steps), (tab(4),print(Step),nl)).

For example:

?- dfid(5), fail. % backtrack through all solutions.
leash([_G254])
leash([_G254, _G257])
leash([_G254, _G257, _G260])
leash([_G254, _G257, _G260, _G263])

4
    walk(atfloor, atwindow)
    push(atwindow, middle)
    climb
    grasp

leash([_G254, _G257, _G260, _G263, _G266])

5
    walk(atfloor, atwindow)
    push(atwindow, _G280)
    push(_G280, middle)
    climb
    grasp
5
    walk(atfloor, atwindow)
    push(atwindow, middle)
    walk(middle, middle)
    climb
    grasp
5
    walk(atfloor, _G272)
    walk(_G272, atwindow)
    push(atwindow, middle)
    climb
    grasp

Meta-interpreters

A more generic version of the above can be build using meta-interpreters.

Programs can also be considered as input data for other programs. Prolog programs are sequences of prolog terms, so prolog programs easily serve as input data. A prolog meta-interpreter uses program data as a basis for additional computations. In this section, several prolog meta-interpreters are discussed that modify the computation of prolog goals.

Simple Prolog meta-interpreter

The following program specifies an interpreter of clause trees in Prolog

clause_tree(true) :- !.      /* true leaf */ 
clause_tree((G,R)) :- !,
   clause_tree(G),
   clause_tree(R).           /* search each branch */ 
clause_tree(G) :- 
   clause(G,Body),
   clause_tree(Body).        /* grow branches */ 

To illustrate how this meta-interpreter works, let us use it to interpret the program for 'member':

member(X,[X|_]). 
member(X,[_|R]) :- member(X,R). 

Consider the goal

?- clause_tree(member(X,[a,b,c])). 
X = a ; 
X = b ; 
X = c ; 
no 

So the meta-interpreter grows clause trees, and finds the same answers that Prolog itself would calculate.

Using evaluation for built-in goals

One can add evaluation to the 'clause_tree' program. The way this is done depends upon the kind of Prolog one is using.

clause_tree(true) :- !.
clause_tree((G,R)) :-
   !,
   clause_tree(G), 
   clause_tree(R).
clause_tree(G) :- 
   (predicate_property(G,built_in) ;  
     predicate_property(G,compiled) ), 
   call(G).              %% let Prolog do it
clause_tree(G) :- clause(G,Body), clause_tree(Body). 
The new third clause says that if the goal G is built_in (e.g., arithmetic) then call the underlying Prolog goal to do the evaluation; and similarly if G is compiled into memory. So, for example, with the new definition
?- clause_tree((X is 3, X <  5)). 
X = 3 

Loop checking

Meta-interpreters are very useful for redesigning the control mechanisms of Prolog. For example, consider the following program:

p :- q. 
q :- p. 
p :- r. 
r. 

If one were to try the Prolog goal ?- p the first two clauses would be the cause of infinite looping, like in the following derivation

However, r, p, and q are all consequences of the program. The fact that Prolog cannot compute these consequences is an example of the incompleteness of Prolog. For the sake of general programming efficiency, Prolog does not try to detect any loops. There is also a theoretical limitation: There is no general loop-detecting algorithm that could be applied to Prolog that would succeed in detecting all loops. If there were, one would have, in effect, solved the halting problem.

However, it is still a very interesting problem to try to detect some loops. Here is a modification of the 'clause_tree/1' program that can detect some loops:

clause_tree(true,_) :- !. 
clause_tree((G,R),Trail) :-
   !, 
   clause_tree(G,Trail),
   clause_tree(R,Trail). 
clause_tree(G,Trail) :-
   loop_detect(G,Trail),
   !, 
   fail.
clause_tree(G,Trail) :- 
   clause(G,Body), 
   clause_tree(Body,[G|Trail]).

loop_detect(G,[G1|_]) :- G == G1.
loop_detect(G,[_|R])  :- loop_detect(G,R). 

We have added a Trail parameter to the meta-interpreter, and a 'loop_detect'. It is instructive to compare this program with the search program in section 2.13. Now, we have

?- clause_tree(p,[]). 

yes 

The third clause "catches" the loop and allows the other choice 'p :- r' to be tried.

Iterative deepening

To motivate the last topic in this section, consider the following "bad" program

connected(X,Y) :- connected(X,Z), connected(Z,Y).
connected(1,2).
connected(2,3).
connected(3,4).
connected(4,5).

If one were to try to compute 'connected' using Prolog, there is going to be a problem with "left recursion". For example,

?- connected(1,2).
 ...

One method for avoiding this kind of infinite descent is called "iterative deepening", which means that one still uses depth-first search, but one "iteratively" searches to a certain depth, then deeper, then deeper still, ..., etc. Here is a meta-interpreter that does this. This meta-interpreter is quite similar to the previous ones. However, this one has only two extra parameters: one for current depth of a goal, and one for the current depth limit. This interpreter does not do the previous kind of loop check nor does it generate the symbolic tree, but it does call Prolog for evaluation goals. (Of course, these features could be easily added, but here we concentrate just on the iterative deepening concept.)

clause_tree4(true,_) :- !.  % not necessary. "true" is a built-in
clause_tree4((A,B),D) :- !, 
    clause_tree4(A,D), 
    clause_tree4(B,D).
clause_tree4(A,_) :- 
	predicate_property(A,built_in),
    !,
   	call(A).
clause_tree4(A,D) :- 
	D > 0,
    clause(A,B),
    clause_tree4(B,D - 1).

iterative_deepening(G,Max) :-
	between(1,Max,D), 	
	clause_tree4(G,D),
	print(foundAt(D)),nl.

Note that the intended interpreter is 'iterative_deepening(+Goal,+InitDepth)' where +Goal could, of course, contain variables. This iterative deepening meta-interpreter uses "stages": the depth of the first stage is +InitDepth, the second stage goes to depth InitDepth+5, etc.

So now, suppose that this program and the original 'connected' program have been loaded. What happens this time? ...

?- iterative_deepening(connected(1,What),10),fail.
foundAt(1)
foundAt(2)
foundAt(2)
foundAt(3)
foundAt(3)
foundAt(3)
foundAt(3)
foundAt(3)
foundAt(4)
foundAt(4)
foundAt(4)
foundAt(4)
foundAt(4)
foundAt(4)
foundAt(4)
foundAt(4)
foundAt(4)
foundAt(5)
foundAt(5)
foundAt(5)
foundAt(5)
foundAt(5)
foundAt(5)
foundAt(5)
foundAt(5)
foundAt(5)
foundAt(6)
foundAt(6)
foundAt(6)
foundAt(6)
foundAt(6)
foundAt(6)
foundAt(6)
foundAt(6)
foundAt(6)
foundAt(7)
foundAt(7)
foundAt(7)
foundAt(7)
foundAt(7)
foundAt(7)
foundAt(7)
foundAt(7)
foundAt(7)
foundAt(8)
foundAt(8)
foundAt(8)
foundAt(8)
foundAt(8)
foundAt(8)
foundAt(8)
foundAt(8)
foundAt(8)
foundAt(9)
foundAt(9)
foundAt(9)
foundAt(9)
foundAt(9)
foundAt(9)
foundAt(9)
foundAt(9)
foundAt(9)
foundAt(10)
foundAt(10)
foundAt(10)
foundAt(10)
foundAt(10)
foundAt(10)
foundAt(10)
foundAt(10)
foundAt(10)

Theoretically, iterative deepening has a kind of optimal behavior for "blind" search: Iterative deepening will find any possible solution in a stage deep enough to include the clause tree justifying that solution -- space O(d), d=depth -- in time O(b**d), b= average branching factor. Other kinds of complete search, such as breadth-first (or iterative broadening), have to search through a larger expected number of nodes.