Section III - Program Implementation



Programming Languages

On completion of this chapter, you will be able to:

• Distinguish between low-level and high-level programming languages.

• Differentiate between an assembler, an interpreter, and a compiler.

• Differentiate among the five different generations of programming languages.

1 Introduction

Many programming languages have been developed for coding programs. At the time of writing, I have counted in excess of 560, although some of them are now extinct. Numerous as they are, these languages can be grouped into two categories: low-level languages and high-level languages. This classification gives rise to the different generations of programming languages. As we will see, low-level languages cover two generations, the first and the second. High-level languages give rise to what are described as third generation languages, and there are also fourth and fifth generation languages. As you might suspect by now, each generation has features that distinguish it from the others. This chapter will help you to understand the similarities and differences between one generation of programming languages and another.

2 Low Level Languages

These types of programming languages are machine dependent. That is, the code written for one type of machine or processor cannot be understood by a different type of processor. This is because every class of processor is designed with its own set of primitive instruction codes.

1 First Generation Language

The first type of low-level language is called first generation language, sometimes called machine language. In every first generation language, the code must be written in binary digits. As you know, a computer (the processor) can only understand binary information. As such, first generation languages do not need a compiler or interpreter to run: the processor for which the language was written is able to run the binary code directly. The following section takes a closer look at machine language programs in general.

1 Machine Language Instruction

Let us design a hypothetical computer with a four-field instruction format. That is, each instruction for this machine is made up of four fields, as shown in Figure 1.

Figure 1 Format for one machine instruction code

The operation code field represents some primitive operation of the processor, such as an arithmetic operation. Each register number field represents one of the registers of the processor, and the memory address field represents an address in primary memory (RAM). We will make each field four bits long. Also, let us design this machine with one accumulator register within the ALU, so that when certain operations are performed, the result is placed in the accumulator. The accumulator register is implied within the instruction. Let us give this machine one more feature: it has seven basic operations that are native to it, namely load, copy, add, subtract, multiply, divide, and store. Figure 2 shows the binary equivalent of each operation.

|Basic operations |Binary code representing operations |

|Load |0000 |

|Copy |0001 |

|Add |0010 |

|Subtract |0011 |

|Multiply |0100 |

|Divide |0101 |

|Store |0110 |

Figure 2 Some basic operations

We will also design this machine with eight registers, as shown in Figure 3.

|Register Number |Binary equivalent |

|0 |0000 |

|1 |0001 |

|2 |0010 |

|3 |0011 |

|4 |0100 |

|5 |0101 |

|6 |0110 |

|7 |0111 |

Figure 3 Register numbers and their binary equivalents

Registers 0 through 4 are designed for programmers’ use, while registers 5 through 7 are set aside for certain systems operations.

Let us assume that this machine has 256 bytes of RAM. The address number will go from 0 through 255 base 10 (0000 0000 – 1111 1111 base 2). See Figure 4.

Figure 4

Remember that the memory address field of an instruction can only hold a four-bit code, 0000 through 1111. An instruction can therefore refer directly to only sixteen memory addresses, those whose binary form fits in the four rightmost bits of the instruction.

The complete list of these memory addresses is shown in Figure 5.

|Address |Binary | |Address |Binary |

|0 |0000 | |8 |1000 |

|1 |0001 | |9 |1001 |

|2 |0010 | |10 |1010 |

|3 |0011 | |11 |1011 |

|4 |0100 | |12 |1100 |

|5 |0101 | |13 |1101 |

|6 |0110 | |14 |1110 |

|7 |0111 | |15 |1111 |

Figure 5

As part of the specification of this machine, an address field containing 0000 will always mean that the field carries no information for the instruction. In short, the instruction does not involve this field. An operation code field containing 0000, however, keeps its meaning, load. Based on the above specification for this hypothetical computer, a typical instruction would be:

Figure 6 One machine instruction

In our example we have separated each field, but as far as the processor is concerned, it sees only one string of binary digits, as in 0010001001000000, where the leading 0010 is the operation code for add given in Figure 2. In this example, let us suppose that register 2 contains the value 20 and register 4 contains 5; after the execution of this instruction the accumulator will contain the value 25.
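The packing of four 4-bit fields into one 16-bit string can be sketched in Python. This is only an illustration under stated assumptions: the function name `encode` and its parameter names are hypothetical, and the operation codes are copied from Figure 2, in which add is 0010.

```python
# Operation codes of the hypothetical machine, copied from Figure 2.
OPCODES = {
    "load": 0b0000, "copy": 0b0001, "add": 0b0010, "subtract": 0b0011,
    "multiply": 0b0100, "divide": 0b0101, "store": 0b0110,
}


def encode(operation, reg_a=0, reg_b=0, address=0):
    """Pack four 4-bit fields into one 16-bit instruction string."""
    fields = (OPCODES[operation], reg_a, reg_b, address)
    for value in fields:
        if not 0 <= value <= 0b1111:
            raise ValueError("each field must fit in four bits")
    # Each field is written as exactly four binary digits, then joined.
    return "".join(format(value, "04b") for value in fields)


# Add the content of register 2 to the content of register 4.
instruction = encode("add", reg_a=2, reg_b=4)
print(instruction)  # 0010001001000000
```

A field value larger than 15 raises an error, which mirrors the four-bit limit on every field of this machine.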

Example

Given the following pseudocode algorithm, write the equivalent machine language version.

• Load the accumulator with the value stored at memory location 12.

• Add the content of register 2.

• Multiply by the content of register 3.

• Store the result at memory location 12.

See Figures 8.xxx for the binary equivalent of the respective operation code, Figure 3 for the binary equivalent of the respective register, and Figure 5 for the binary equivalent of the memory address. The result of these combinations is shown in Figure 7. For readability, a space is placed between the groups of digits that represent each field of the instruction.

Figure 7
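To see what the processor would do with these four instructions, here is a minimal Python sketch of the hypothetical machine. It rests on several assumptions that are not part of the specification: the function name `run` is invented, only the four operations used in the example are handled, a single-register add or multiply combines that register with the accumulator, and the starting values in memory and in the registers are made up. The operation codes are those of Figure 2.

```python
def run(program, memory, registers):
    """Execute 16-bit instruction strings and return the final accumulator."""
    acc = 0
    for word in program:
        bits = word.replace(" ", "")
        opcode = bits[0:4]
        reg = int(bits[4:8], 2)        # first register field
        address = int(bits[12:16], 2)  # memory address field
        if opcode == "0000":           # load: accumulator <- memory[address]
            acc = memory[address]
        elif opcode == "0010":         # add: accumulator <- accumulator + register
            acc += registers[reg]
        elif opcode == "0100":         # multiply: accumulator <- accumulator * register
            acc *= registers[reg]
        elif opcode == "0110":         # store: memory[address] <- accumulator
            memory[address] = acc
        else:
            raise ValueError(f"opcode {opcode} is not handled in this sketch")
    return acc


# Hypothetical starting state: location 12 holds 3, R2 holds 4, R3 holds 5.
memory = [0] * 16
memory[12] = 3
registers = [0] * 8
registers[2] = 4
registers[3] = 5

program = [
    "0000 0000 0000 1100",  # load the accumulator from memory location 12
    "0010 0010 0000 0000",  # add the content of register 2
    "0100 0011 0000 0000",  # multiply by the content of register 3
    "0110 0000 0000 1100",  # store the result at memory location 12
]
result = run(program, memory, registers)
print(result, memory[12])  # 35 35
```

With these starting values the program computes (3 + 4) * 5 and writes the answer back to location 12.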

As we have stated earlier, each type of machine has its own machine instruction set. If we have another type of computer whose arithmetic operation codes are different from the current example, then this other machine cannot execute the instructions designed for the first machine. Figure 8 shows the instruction set for a different type of machine.

|Basic instruction |Binary equivalent |

|Load |0001 |

|Copy |0010 |

|Add |0011 |

|Subtract |0100 |

|Multiply |0101 |

|Divide |0110 |

|Store |0111 |

Figure 8

If we use the previous machine instruction code on this machine, by now you should see that it will not work, since the codes for the operation fields are not compatible. In the first machine the operation code for load is 0000, and in the second it is 0001.
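The incompatibility can be made concrete by decoding the same bit patterns with both operation code tables. The sketch below is illustrative only: the names `MACHINE_A`, `MACHINE_B`, and `decode` are hypothetical, the tables are copied from Figure 2 and Figure 8, and the four instructions are those of the worked example, written with the Figure 2 code for add.

```python
# Operation code tables copied from Figure 2 (first machine)
# and Figure 8 (second machine).
MACHINE_A = {"0000": "load", "0001": "copy", "0010": "add", "0011": "subtract",
             "0100": "multiply", "0101": "divide", "0110": "store"}
MACHINE_B = {"0001": "load", "0010": "copy", "0011": "add", "0100": "subtract",
             "0101": "multiply", "0110": "divide", "0111": "store"}


def decode(word, table):
    """Return the operation that a machine with this table sees in the word."""
    opcode = word.replace(" ", "")[:4]
    return table.get(opcode, "undefined")


program = [
    "0000 0000 0000 1100",
    "0010 0010 0000 0000",
    "0100 0011 0000 0000",
    "0110 0000 0000 1100",
]
seen_by_a = [decode(word, MACHINE_A) for word in program]
seen_by_b = [decode(word, MACHINE_B) for word in program]
print(seen_by_a)  # ['load', 'add', 'multiply', 'store']
print(seen_by_b)  # ['undefined', 'copy', 'subtract', 'divide']
```

The second machine either rejects the instruction or performs a different operation, so the program written for the first machine is meaningless on it.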

The major advantage of first generation language is that the code runs very fast and efficiently because it is directly executed by the CPU. That is, the code does not need any translation. The major disadvantages are: the code is machine dependent, and the series of 1s and 0s make it tedious for the programmer to write, especially for long programs. This is a recipe for making typographical errors.

2 Second Generation Language

The second type of low-level language is called second generation language, popularly called assembly language. In a second generation language, instructions are not written in binary digits as we saw with first generation language. Instead, operation codes are written as mnemonic codes (short abbreviated words); registers are written as Rn, where n denotes a number; and memory addresses are written as hexadecimal numbers, or are sometimes referenced by register names. This makes programming much easier than trying to program using binary numbers.

1 Assembly Language

As you know, a computer (the processor) can only understand binary information. In this situation you need a program that understands the mnemonic codes and can translate them into binary codes. Every type of computer has its associated second generation programming language, called its assembly language. The assembly language code that you type is called the source code, sometimes called the source program. The source code needs a program to translate it into machine language instructions. The translated version of the source code is called the object code, or object program. A program that translates assembly language source code into machine instructions is called an assembler. Two of the most popular assemblers for the personal computer are Borland Turbo Assembler (TASM) and Microsoft Macro Assembler (MASM). Figure 9 shows the flow of programming activities necessary when using a second generation language.

Figure 9

As in the case of machine language instructions, the object code generated by an assembler can only run on machines of the same kind, because the object code is in fact machine language. This is so because assembly language is a machine-specific programming language: there is a one-to-one correspondence between each of its statements and an instruction in the computer's native machine language.
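The one-to-one correspondence can be illustrated with a toy assembler for the hypothetical machine of the previous section. This is a sketch, not a description of how TASM or MASM work: the function name `assemble_line` is invented, only the four mnemonics that appear in the chapter's figure text (LDA, ADD, MUL, STO) are supported, and the operation codes are those of Figure 2.

```python
# Mnemonics from the chapter's example, mapped to the Figure 2 operation codes.
MNEMONICS = {"LDA": "0000", "ADD": "0010", "MUL": "0100", "STO": "0110"}


def assemble_line(line):
    """Translate one assembly statement into one 16-bit machine instruction."""
    mnemonic, operand = line.split()
    opcode = MNEMONICS[mnemonic]
    if operand.upper().startswith("R"):   # register operand, for example R2
        reg, address = int(operand[1:]), 0
    else:                                 # memory address operand, for example 12
        reg, address = 0, int(operand)
    return f"{opcode} {reg:04b} 0000 {address:04b}"


source_code = ["LDA 12", "ADD R2", "MUL R3", "STO 12"]
object_code = [assemble_line(line) for line in source_code]
for statement, instruction in zip(source_code, object_code):
    print(f"{statement:8} {instruction}")
```

Each source statement yields exactly one machine instruction, which is why the object code is as machine dependent as hand-written binary.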

3 High Level Languages

A high-level programming language is any computer programming language in which instructions are written in a form that resembles human (natural) language. That is, its code is further removed from the machine language of the first generation. See Figure 10.

Figure 10

Because of the difficulties encountered using low-level languages, high-level languages were developed to make programs easier to write and to read. These languages use words that more clearly describe the task being performed. In general, the main advantages of high-level languages over low-level languages are that they are easier to read, write, and maintain. High-level languages span the third, fourth, and fifth generations, and beyond. The rest of the chapter describes these generations of programming languages.

1 Third Generation Languages

Third generation languages are largely procedural. That is, they concentrate more on how to do something than on describing what is to be done. Procedural programming languages follow a pattern similar to the way solution algorithms are designed, as described in Chapter 4. Another feature of third generation languages is that the programmer's code, called the source program or source code, must be translated into machine language. This requires special translator programs to convert the source code into machine code.

There are two kinds of translators used to translate and execute a program: the compiler and the interpreter. Both have one thing in common: they convert the source code into machine language. They differ in how they carry out the translation. An interpreter translates and executes the code one line at a time. Thus, if the program has a syntax error (a violation of the rules of the language) lower down in the code, you will not know until the interpreter reaches that statement. A compiler, on the other hand, makes sure that there is no syntax error anywhere in the program before the program starts to execute. As a result, most people prefer a compiler to an interpreter. Generally an interpreter is easier to learn to use than a compiler, but a compiler gives a better trade-off: once the source code is compiled, the compiled version, called the executable code, can be executed repeatedly without re-compiling the source code. Interpreters do not behave this way. Each time the program is to be executed, the source code must be re-translated. Figure 11 shows the flow of activities carried out by an interpreter, from creating the program, to translating it, to executing it.

Figure 11

The programming process begins the moment you create or type the program, as shown in Figure 11. The code you type is called the source code. When the interpreter is supplied a program, it checks whether there is another line of code in the file. If the answer is no, the source code has ended and the program terminates normally. If there is another line of code in the source file, it is interpreted. If that line has a syntax error, the program halts, this time abnormally. The programmer then has the option of fixing the error and re-submitting the entire code to be interpreted again. If there is no error in the line, it is executed, and the process starts over for the next line of code, until the entire program has been interpreted and executed.
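The loop just described can be imitated in a few lines of Python. This is a rough sketch rather than a real interpreter: Python's built-in `compile` and `exec` stand in for the translate and execute steps, and the names `interpret` and `show` are hypothetical.

```python
def interpret(source_lines):
    """Translate and execute one line at a time, stopping at the first bad line."""
    output = []
    environment = {"show": output.append}   # 'show' records what the program displays
    for number, line in enumerate(source_lines, start=1):
        try:
            code = compile(line, "<line>", "exec")   # check the syntax of this line only
        except SyntaxError:
            output.append(f"aborted: syntax error on line {number}")
            break                                     # abnormal termination
        exec(code, environment)                       # execute before reading the next line
    return output


program = [
    "x = 2",
    "show(x * 10)",
    "y = = 3",                  # the syntax error sits on the third line
    "show('never reached')",
]
print(interpret(program))  # [20, 'aborted: syntax error on line 3']
```

The first two lines run and produce output before the error on the third line is ever noticed, which is exactly the behaviour attributed to interpreters in the text.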

Figure 12 shows the flow of activities carried out by a compiler, from creating the program, to translating it, to executing it. As opposed to the interpreter, the compiler makes sure that there are no syntax errors in the entire source code before the program starts to execute.

Figure 12

The process of compiling and executing a program is more complicated than interpreting a program. First we will look at the similarities, and then the differences. As in the case of the interpreter, the programming process begins the moment you create or type the program, as shown in Figure 12. The compiler examines all of the source code to make sure that every line is free of syntax errors. If there is any syntax error, it must be fixed and the program re-submitted for compilation. Once it is determined that the source code has no syntax errors, the compiler produces an output code in binary, called the object code.
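The whole-program check can be imitated in Python as well. Again this is only a sketch: Python's built-in `compile` is applied to the entire source text at once to play the part of the compiler's syntax check, and the names `compile_then_run` and `show` are hypothetical.

```python
def compile_then_run(source_text):
    """Check the whole program first; execute only if no syntax error is found."""
    output = []
    try:
        code = compile(source_text, "<program>", "exec")   # entire source is examined
    except SyntaxError as error:
        return [f"compile failed: syntax error on line {error.lineno}"]
    exec(code, {"show": output.append})                      # runs only after a clean compile
    return output


bad_source = "x = 2\nshow(x * 10)\ny = = 3\nshow('never reached')\n"
good_source = "x = 2\nshow(x * 10)\n"

print(compile_then_run(bad_source))
print(compile_then_run(good_source))
```

For the faulty source nothing is executed at all, not even the valid first two lines, because the error is reported before execution begins.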

When a programmer writes a program, it is rare that he or she writes all the code necessary for the program to work. For example, the code for input and output is long and difficult to write, so the compiler designers usually write it for us and store it in files commonly called libraries. The next step in the compilation process is to combine some of that code with your object code to form another file of code called a load module. The compiler has another module, called the linker, which links the object code and the library code. The linker typically stores this new version of your program in a file with the extension “.exe”. If the linking is unsuccessful, the necessary corrections have to be made, and the compilation process begins again. If the linking is successful, a third module of the compiler places the load module into primary memory for the program to be executed. It is at this stage that the program is supplied with data and that we get output.

Although the program may compile and link, this does not guarantee that it will execute successfully; another type of error could occur. This kind of error is called an exception. It occurs only during runtime; hence it is also called a runtime error. When an exception occurs, the program is aborted abruptly for some unforeseen reason. For instance, if a program attempts to divide by zero, to read from an empty file, or to read from a file that does not exist, the computer will abort the program instantly. Essentially, exceptions are impossible tasks that the program is requesting the machine to perform, and the machine's response is to terminate the execution of the program abnormally. When this happens, the programmer must once again fix the problem and re-submit the program for re-compilation. Lastly, if the solution is incorrect because of a logic error, then the correction has to be made, and the compilation process begins all over again.
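The difference between a syntax error and a runtime error can be demonstrated with a short Python sketch. The program text, the function names `run` and `read_missing_file`, and the file name are all made up for the illustration; the exceptions are caught here only so that the outcome can be displayed instead of the program aborting.

```python
import os
import tempfile

# This source has no syntax errors, so the translation step succeeds.
source = "total = 10\ncount = 0\naverage = total / count\n"
code = compile(source, "<program>", "exec")


def run(code_object):
    """Execute translated code and report how the run ended."""
    try:
        exec(code_object, {})
        return "finished normally"
    except ZeroDivisionError:
        return "runtime error: attempted to divide by zero"


def read_missing_file():
    """Try to read a file that is known not to exist."""
    missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")
    try:
        with open(missing) as handle:
            return handle.read()
    except FileNotFoundError:
        return "runtime error: file does not exist"


print(run(code))
print(read_missing_file())
```

Both problems are invisible to the syntax check and appear only when the offending statement is actually executed.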

Another feature of third generation languages is that one statement (instruction) in any of these languages generally generates several machine language instructions. This is true of all high-level programming languages. For instance, the algebraic statement:

Y = ( Y + X ) * Z

is coded just as you see it here in Java, C, and Basic (with a terminating semicolon in Java and C). In Pascal it is coded as:

Y := ( Y + X ) * Z

Notice that only the assignment symbol is different. When this statement is compiled or interpreted, however, the assembly code generated is similar to what is shown in Figure 13.

Figure xxx

Figure 13

In Figure xxx we see that the single high-level language statement y = ( y + x ) * z, when compiled, generates six assembly language statements, which in turn generate six machine instructions.
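The six steps can be traced in Python, with one line of code standing for each assembly statement in the chapter's figure text (LDA Y, MOV R2, X, ADD R2, MOV R2, Z, MUL R2, STO Y). The function name `run_six_steps` and the sample values are hypothetical, and the variables `acc` and `r2` merely imitate the accumulator and register R2.

```python
def run_six_steps(x, y, z):
    """Carry out Y = (Y + X) * Z as six separate low-level steps."""
    memory = {"X": x, "Y": y, "Z": z}
    acc = memory["Y"]      # LDA Y      load accumulator from memory location Y
    r2 = memory["X"]       # MOV R2, X  copy memory location X into register R2
    acc = acc + r2         # ADD R2     add contents of accumulator and R2
    r2 = memory["Z"]       # MOV R2, Z  copy memory location Z into register R2
    acc = acc * r2         # MUL R2     multiply accumulator by R2
    memory["Y"] = acc      # STO Y      save value in accumulator in memory Y
    return memory["Y"]


x, y, z = 3, 2, 4
print(run_six_steps(x, y, z))   # 20
print((y + x) * z)              # 20, the single high-level statement
```

One readable statement in the high-level language does the work of all six low-level steps.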

Some of the best-known third generation languages are C, C++, Pascal, FORTRAN, COBOL, and Basic. All but Basic are normally compiled; Basic is traditionally interpreted.

2 Fourth Generation Languages

Fourth-generation languages are so named because they mark a clear departure from the previous generations. Their instructions are not written in binary, nor are they written in assembly format. We know from the previous sections that third-generation languages are procedural in nature; that is, they concentrate on how things get done. Fourth-generation languages do not fit this model either. Instead, they describe what is to be done in a more or less natural language format. That is, they state the goals to be achieved, but they do not list the steps needed to achieve them. The three most typical features of fourth-generation languages are:

• They are non-procedural. As mentioned, their instructions focus not on how the task gets done but on what needs to be done.

• They use English-like phrases and sentence formats to issue instructions.

• As a result of the two previous points, fourth-generation languages are favoured in industries that depend on data retrieval and queries. Employees in this environment do not necessarily have to be knowledgeable in computer science; instead, they are usually trainable individuals who can write commands. Fourth-generation languages are widely regarded as increasing productivity in the workplace.

One broad area of application for fourth-generation languages is database querying with Structured Query Language (SQL). In a relational database, data are stored in tables such as the one shown in Figure xxxxx. The first row of the table shows its attributes: Id, LastName, and so on. Let's say the name of the table is Employee, and you want to see the first name, last name, and position of all employees who earn $50,000.00 or more. In a fourth-generation language, a typical command would be somewhat like this:

SELECT FirstName, LastName, position FROM Employee WHERE salary >= 50000 ;

A command like this does not require programming knowledge to understand. The individual can be trained to construct queries such as this.

|Id |LastName |FirstName |Salary |Position |

|M-0010 |Smith |James |50000 |Manager |

|F-1000 |Richards |Mary |75000 |Manager |

|F-2000 |Hammond |Berrisford |40000 |Staff |

|M-0010 |Smith |Maureen |28000 |Part-Time |

|M-0030 |Harvey |Val |55000 |Staff |

Figure 14

In this case the required information would be:

James Smith Manager

Mary Richards Manager

Val Harvey Staff
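The query can be tried out with the `sqlite3` module from Python's standard library. The sketch below is illustrative: it builds the Employee table in a temporary in-memory database, with column types that are assumptions, loads the five rows from Figure 14 exactly as they appear there, and runs the SELECT statement from the text.

```python
import sqlite3

# Build the Employee table of Figure 14 in a throwaway in-memory database.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE Employee "
    "(Id TEXT, LastName TEXT, FirstName TEXT, Salary INTEGER, Position TEXT)"
)
rows = [
    ("M-0010", "Smith", "James", 50000, "Manager"),
    ("F-1000", "Richards", "Mary", 75000, "Manager"),
    ("F-2000", "Hammond", "Berrisford", 40000, "Staff"),
    ("M-0010", "Smith", "Maureen", 28000, "Part-Time"),
    ("M-0030", "Harvey", "Val", 55000, "Staff"),
]
connection.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?, ?)", rows)

# The query states what is wanted, not how to search the table.
query = "SELECT FirstName, LastName, Position FROM Employee WHERE Salary >= 50000"
result = connection.execute(query).fetchall()
for first_name, last_name, position in result:
    print(first_name, last_name, position)
connection.close()
```

Notice that nothing in the query says how to loop over the rows or compare salaries; the database system works that out for itself.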

3 Fifth Generation Languages and Beyond

As opposed to first, second, third, and fourth generation languages, fifth generation languages are built around the concept of solving problems using constraints rather than using an algorithm. By this we mean that a problem is solved by stating the constraints (conditions or properties) that any solution to the problem must satisfy. This way, the programmer only needs to worry about what problems need to be solved and what conditions need to be met, without worrying about how to implement an algorithm to solve them.
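The flavour of constraint solving can be suggested with a small Python sketch. Python is not a fifth generation language, so the search is written out by hand in a generic helper; the point is that the problem itself is stated purely as conditions. The function name `solve` and the sample problem (two digits whose sum is 10 and whose difference is 2) are hypothetical.

```python
from itertools import product


def solve(variables, domain, constraints):
    """Return every assignment of domain values that satisfies all constraints."""
    solutions = []
    for values in product(domain, repeat=len(variables)):
        candidate = dict(zip(variables, values))
        if all(check(candidate) for check in constraints):
            solutions.append(candidate)
    return solutions


# The problem is described only by the conditions a solution must meet.
constraints = [
    lambda v: v["x"] + v["y"] == 10,
    lambda v: v["x"] - v["y"] == 2,
]
answers = solve(["x", "y"], range(10), constraints)
print(answers)  # [{'x': 6, 'y': 4}]
```

The person stating the problem supplies only the two conditions; how candidates are generated and tested is left entirely to the solver.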

1 Artificial Intelligence (AI)

Fifth-generation languages are used mainly in the area of artificial intelligence (AI). The field of artificial intelligence focuses on areas such as:

• Deductive reasoning – Like human beings, this branch of AI tends to solve most of its problems using fast, intuitive judgments rather than a conscious, step-by-step deduction approach.

• Knowledge representation – This is a method whereby knowledge about a topic or an object is stored in an expert system. The knowledge is typically a series of IF condition, THEN take action rules.

• Machine learning – This area of AI focuses on finding patterns in data. The idea behind machine learning is to replace the writing of code with the supplying of data to the computer, and then to let the computer figure out what information is needed from the data by looking at examples that have also been fed to it. The main aim is to have the computer supply a generalized solution that goes beyond just the examples that were given to it.

• Natural language processing - This branch of study is concerned with the interactions between computers and human (natural) languages. That is, such systems can convert information from computer databases into readable human language. They can even convert samples of human language into more formal representations that are easier for computer programs to manipulate.

• Motion navigation - The field of robotics is one such area of AI. That is, intelligence is required for robots to be able to handle such tasks as manipulating objects and navigating among them: knowing where the objects are, learning what is around them, and figuring out how to get an object from one location to another.
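To make the idea of IF condition, THEN take action rules more concrete, here is a minimal Python sketch of a rule base and a loop that keeps applying the rules until nothing new can be concluded. The rules about a car that will not start, and the names `RULES` and `infer`, are invented for the illustration and are not taken from any real expert system.

```python
# Each rule: IF all of these facts are known, THEN add this conclusion.
RULES = [
    ({"engine_will_not_start", "lights_are_dim"}, "battery_is_weak"),
    ({"battery_is_weak"}, "recharge_battery"),
]


def infer(facts, rules):
    """Apply the rules repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


observed = {"engine_will_not_start", "lights_are_dim"}
print(sorted(infer(observed, RULES)))
```

The knowledge lives in the rule list, not in the program logic, so new rules can be added without touching the `infer` function.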

Beyond Fifth Generation Languages

Beyond fifth generation languages (AI) are two new programming paradigms: visual programming languages (VPL) and object oriented programming (OOP).

1 Visual Programming Language

Visual programming languages, as the term suggests, are programming languages that let you build programs using icons rather than text. That is, a VPL is based on the idea of using buttons, text boxes, check boxes, command buttons, labels, and connecting arrows to create graphical user interface (GUI) programs on your computer screen.

Visual Basic (VB) is one of the forerunners of VPL. Its main attraction, unlike many other languages, is the ease with which it allows the programmer to create appealing graphical user programs with little coding. Other programming languages may require hundreds of lines of code and several hours of programming. The way most VPLs work is that, as the programmer lays out the buttons, labels, arrows, and so on in the GUI form area, much of the program code is generated automatically. The language provides a tool box from which you can select the various items to build your GUI. Figure 14 shows a typical Visual Basic tool box. Most VPLs, including VB, follow this three-step generic format when developing an application:

1. Design the appearance of your GUI application before setting it up.

2. Assign property settings to the objects of your GUI program.

3. Write any necessary code to direct specific tasks at runtime.

Figure 16

Figure 17

2 Object Oriented Programming (OOP)

Object oriented programming (OOP), which is also beyond fifth generation programming, is a type of programming that defines both the data and the operations that can be performed on the data as a complete program unit. In this way, the entire unit, both data and operations, is referred to as the object. In this paradigm the data declarations are called fields, and the operations are called methods. Both the fields and the methods are stored as one unit, typically called a class.

Writing object-oriented programs requires an object-oriented programming language (OOPL). Three of the more popular object oriented languages are Java, C++, and Smalltalk. The following running example gives an understanding of some of the features of OOP. We will use the Java programming language to highlight these features. This section does not intend to teach Java; it is only meant to demonstrate a few features of object oriented programming. In the running example we will limit our discussion to finding the area of some types of two-dimensional surfaces and the volume of some three-dimensional figures.

After this brief introduction to object oriented programming, we discover that the major advantages of OOP are:

• Modularity - each object forms a separate entity with its own set of data and its own set of operations. This concept, called encapsulation, protects the data by making it difficult, if not impossible, for objects outside of the system to access the data.

• Modifiability - it is easy to make minor changes to the entity in terms of the data representation or operations. In addition, changes inside a class do not affect any other part of the program.

• Extensibility – the concept of inheritance allows you to create new entities from existing ones by adding new features. The extended entity keeps the features of the original entity and adds its own.

• Maintainability – because each object is distinct, it is easier to maintain an entire system by modifying only those entities that require changes. A change made to one entity usually has little or no effect on existing modules, which reduces programming time, maintenance costs, and program development time.

• Re-usability – a given entity can be reused in different programs at any time.

• Simplicity – object oriented programming models real world objects. The programming is built around the question of what fields are involved in the object and what operations are required on these fields. This reduces program complexity and makes the program structure very clear and easily understood.
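The ideas of fields, methods, classes, and inheritance can be sketched briefly. The chapter's own running example is in Java; the sketch below uses Python purely for brevity, and the class names `Rectangle` and `Box` are hypothetical. It follows the theme of areas of two-dimensional surfaces and volumes of three-dimensional figures.

```python
class Rectangle:
    """A class bundles fields (length, width) with methods (area)."""

    def __init__(self, length, width):
        self.length = length   # field
        self.width = width     # field

    def area(self):            # method that operates on the fields
        return self.length * self.width


class Box(Rectangle):
    """Inheritance: a Box is a Rectangle extended with a height."""

    def __init__(self, length, width, height):
        super().__init__(length, width)   # reuse the existing entity
        self.height = height              # new feature

    def volume(self):
        return self.area() * self.height  # builds on the inherited method


floor = Rectangle(4, 3)
crate = Box(4, 3, 2)
print(floor.area())     # 12
print(crate.area())     # 12, inherited from Rectangle
print(crate.volume())   # 24
```

`Box` gains the `area` method without any of it being rewritten, which illustrates both extensibility and re-usability.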

-----------------------

Text recovered from the figures

Figure 1: Operation code (4 bits), Register number (4 bits), Register number (4 bits), Memory address (4 bits)

Figure 3 (annotations): Registers 0 through 4, programmers’ use; registers 5 through 7, systems use

Figure 4: Primary Memory (RAM), with the address of each location running from 0000 0000 through 1111 1111

Figure 6: 0010 0010 0100 0000 (Add the content of Register 2 to the content of Register 4)

Figure 7:

|Machine instruction |Meaning |

|0000 0000 0000 1100 |Load the accumulator from memory location 12 |

|0010 0010 0000 0000 |Add the contents of register 2 |

|0100 0011 0000 0000 |Multiply by the contents of register 3 |

|0110 0000 0000 1100 |Store the result in memory location 12 |

Figure 9: Assembly language source code that you typed using a text editor (LDA 12, ADD R2, MUL R3, STO 12); an assembler such as Borland Turbo Assembler (TASM) or Microsoft Assembler (MASM) takes the source as input and translates it into machine language (0000 0000 0000 1100, 0010 0010 0000 0000, 0100 0011 0000 0000, 0110 0000 0000 1100)

Figure 10: High level languages, Assembly language, Machine language, Hardware

Figure 11 (interpreter): Create/Edit source code; Another line of code? (No: Program ends, normal termination); Interpreter (Check syntax); Line of code has error? (Yes: Program aborts, abnormal termination); Execution (Perform task)

Figure 12 (compiler): Create/Edit source code; Compile (Entire program); Syntax error?; Object code; Linker combines object code and library code (Output = Load module); Link (Successful?); Load (Place load module in memory); Execute Program; Runtime Error? (Exception); Output; Logic error? (Answer correct?)

Figure xxx and Figure 13:

|Assembly language |Machine language |Meaning |

|LDA Y |0000 0000 0000 xxxx |Load accumulator from memory location Y |

|MOV R2, X |0001 0010 0000 . . . . |Copy memory location X into register R2 |

|ADD R2 |0010 0010 0000 0000 |Add contents of accumulator and R2 |

|MOV R2, Z |0001 0010 0000 . . . . |Copy memory location Z into register R2 |

|MUL R2 |0100 0010 0000 0000 |Multiply accumulator by R2 |

|STO Y |0110 0000 0000 xxxx |Save value in accumulator in memory Y |

Visual Basic tool box labels: Text box, Vertical Scrollbar, Picture, Button, Combo box, Radio button, Check box, Label, Timer, Image, Line, Data control, Shape

Other figure text (placement unclear): “Yes ! ! ! It Works !!” and “I’m learning to Program !”
