Topics:
Learning Objectives:
Note:
The ability to abstract and to generalize is an essential part of any intellectual activity. Abstraction and generalization are fundamental to mathematics and philosophy and are essential in computer science as well.
The importance of abstraction is derived from its ability to hide irrelevant details and from the use of names to reference objects. Programming languages provide abstraction through procedures, functions, and modules which permit the programmer to distinguish between what a program does and how it is implemented. The primary concern of the user of a program is with what it does. This is in contrast with the writer of the program whose primary concern is with how it is implemented. Abstraction is essential in the construction of programs. It places the emphasis on what an object is or does rather than how it is represented or how it works. Thus, it is the primary means of managing complexity in large programs.
Of no less importance is generalization. While abstraction reduces complexity by hiding irrelevant detail, generalization reduces complexity by replacing multiple entities which perform similar functions with a single construct. Programming languages provide generalization through variables, parameterization, generic and polymorphism. Generalization is essential in the construction of programs. It places the emphasis on the similarities between objects. Thus, it helps to manage complexity by collecting individuals into groups and providing a representative which can be used to specify any individual of the group.
Abstraction and generalization are often used together. Abstracts are generalized through parameterization to provide greater utility. In parameterization, one or more parts of an entity are replaced with a name which is new to the entity. The name is used as a parameter. When the parameterized abstract is invoked, it is invoked with a binding of the parameter to an argument.
Exercise: draw a mammal that is not a dog, cat, bear, human, etc...
Principle of Abstraction An abstract is a named entity which may be invoked by mentioning the name.
An object is said to be fully abstract if it can be understood without reference to any thing external to the object.
Terminology. The naming aspect of abstraction is captured in the concepts of binding, definition and declaration while the hiding of irrelevant details is captured by the concept of encapsulation. A binding is an association of two entities. A definition is a binding of a name to an entity, a declaration is a definition which binds a name to a variable, and an assignment is a binding of a value and a variable.We could equally well say identifier instead of name. A variable is an entity whose value is not fixed but may vary. Names are bound to variables in declaration statements. Among the various terms for abstracts found in other texts are module, package, library, unit, subprogram, subroutine, routine, function, procedure, abstract type, object.
A module system provides a way of writing large program so that the various pieces of the program don't interfere with on another because of name clashes and also provides a way of hiding implementation details. ... A module generally consists of two parts, the export part and the local part. The export part of a module consists of language declarations for the symbols available for use in either part of the module and in other modules which import them and module declaration giving the symbols from other modules which are available for use in either part of the module and in other modules which import them. The local part of a module consists of language declarations for the symbols available for use only in this part. TGPL-Hill and Lloyd
The work of constructing large programs is divided among several people, each of whom must produce a part of the whole. Each part is called a module and each programmer must be able to construct his/her module without knowing the internal details of the other parts. This is only possible when each module is is separated into an interface part and an implementation part. The interface part describes all the information required to use the module while the implementation part describes the implementation. This idea is already present in most programming languages in the manner in which functions and procedures are defined. Function and procedure definitions usually are separated into two parts. The first part gives the subprogram's name and parameter requirements and the second part describes the implementation. A module is a generalization of the concept of abstraction in that a module is permitted to contain a collection of definitions. An additional goal of modules is to confine changes to a few modules rather than throughout the program.
While the concept of modules is a useful abstraction, the full advantages of modules are gained only when modules may be written, compiled and possibly executed separately. In many cases modules should be able to be tested independently of other modules.
Advantages
A program has a monolithic block structure if it consists of just one block. This structure is typical of BASIC and early versions of COBOL. The monolithic structure is suitable only for small programs. The scope of every definition is the entire program. Typically all definitions are grouped in one place even if they are used in different parts of the program.
Global Data Return Address1 ... Return Addressn
A program has a flat block structure if it is partitioned into
distinct blocks, an outer all inclosing block one or more inner blocks
i.e., the body may contain additional blocks but the inner blocks may not
contain blocks. This structure is typical of FORTRAN and C. In these languages,
all subprograms (procedures and functions) are separate, and each acts
as a block. Variables can be declared inside a subprogram are then local
to that subprogram. Subprogram names are part of the outer block and thus
their scope is the entire program along with global variables. All subprogram
names and global variables must be unique. If a variable cannot be local
to a subprogram then it must be global and accessible by all subprograms
even though it is used in only a couple of subprograms.
A:
|
B:
|
... | N:
|
A program has nested block structure if blocks may be nested inside other blocks i.e., there is no restriction on the nesting of blocks within the body. This is typical of the block structure of the Algol-like languages. A block can be located close to the point of use. In blocks visibility is controlled by nesting. All names are visible (implicitly exported) to internally nested blocks. No names are visible (exported) to enclosing blocks. In a block, the only names visible are those that are declared in all enclosing blocks or are declared in the block, but not those declared in nested blocks.
A:
|
|
A global name is a name that when referenced within a block refers to a name declared outside the block.
An activation of a block is a time interval during which that block is being executed.
The three basic block structures are sufficient for what is called programming in the small (PITS). These are programs which are comprehensible in their entirety by an individual programmer. However, they are not general enough for very large programs. Large programs which are written by many individuals and which must consist of modules that can be developed and tested independently of other modules. Such programming is called programming in the large (PITL).
The scope rules for modules define relationships among the names within the partitions. There are four choices.
The copy mechanism requires values to be copied into an generic when it is entered and copied out of the generic when the generic is exited. This form of parameter passing is often referred to as passing by value. The formal parameters are local variables and the argument is copied into the local variable on entry to the generic and copied out of the local variable to the argument on exit from the generic.
The value parameter of Pascal and the in parameter of Ada are examples of parameters which may be passed by using the copy mechanism. The value of the argument is copied into the parameter on entry but the value of the formal parameter is not copied to the actual parameter on exit. In imperative languages, copying is unnecessary if the language prohibits assignment to the formal parameter. In such a case, the parameter may be passed by reference.
The copy mechanism has some disadvantages. The copying of large composite values (arrays etc) is expensive and the parameters must be assignable (e.g. expressions and file types in Pascal are not assignable).
An parameter is said to be passed by reference if the argument is an address. References to the parameter are references to the argument. Assignments to the parameter are assignments to the argument. The reference parameter of Pascal and the array and structure parameters of C++ are passed using this mechanism. A toaster provides an illustration of the effect of passing by reference.
If an argument --- A parameter is said to be passed by name if, in effect, the argument replaces parameter throughout the body of the subroutine (textual substitution with suitable renaming of local variables to avoid conflicts between local variables and variables occurring in the argument) i.e., in the subprogram, each reference to the parameter results in an evaluation of the argument in the calling environment.
Conceptually the dynamic scope rules may be implemented as follows. Each variable is assigned a stack to hold the current values of the variable.
When a subprogram is called, a new uninitialized stack element is pushed on the stack corresponding to each variable in the block.
A reference to a variable involves the inspection or updating of the top element of the appropriate stack. This provides access to the variable in closest block with respect to the dynamic calling sequence.
When a subprogram terminates, the stacks corresponding to the variables of the block are popped, restoring the calling environment. The static scope rules may be implemented as follows. The data section of each procedure is associated with an activation record. The activation records are dynamically allocated space on a runtime stack. Each recursive call is associated with it own activation record. Associated with each activation record is a dynamic link which points to the previous activation records, a return address which is the address of the instruction to be executed upon return from the procedure and a static link which provides access to the referencing environment.
An activation record consists of storage for local variables, the static and dynamic links and the return address.
Static Link Return Address Dynamic Link Local Data
The runtime stack of activation records (local data, static and dynamic links).
Figure M.N: Runtime Stack Main calls A which calls B.
Main
0 0 0 Local data for M A
SL RA DL Local data for A B
SL RA DL Local data for B
Global data values are found by following the static chain to the appropriate activation record.
In program construction the module designer must answer the following questions.
Modules must provide answers to two questions:
Functions and procedures are simple modules. Their signature is a description of what they do while their body describes how it is achieved. More typically a module encapsulates a group of components such as types, constants, variables, procedures, functions and so on.
To present a narrow interface to other modules, a module makes only a few components visible outside. Such components are said to be exported by the module. The other components are said to be hidden inside the module. The hidden components are used to implement the exported components.
Access to the components is often by a qualified name -- module name. component name. When strong safety considerations are important, modules using components of another module may be required to explicitly import the required module and the desired components.