CSES: A Common Syntax-Directed Editing System



CSES: A Common Syntax-Directed Editing System

Yun Bai1 Qiangguo Pu1 Nikos Mastorakis2

1Computer Center, University of Science and Technology of Suzhou, Jiangsu 215011, China

2Hellenic Naval Academy, Terma Hatzikyriakou, 18539 Piraeus, Greece

Also: WSEAS, Ag.I.Theologou 17-23, 15773, Zografou, Athens, Greece

Abstract: - We have successfully designed and realized a common Syntax-Directed editing system CSES. CSES can set up a Syntax-Directed editing system for the high-level programming language described by Context-Free grammars and regular grammars. Syntax-Directed text editing and language recognizing technology combine perfectly in CSES. The paper expounds the theoretical basis, components, functions and prototype design of CSES, and briefly introduces the approaches and steps showing how to set up the Syntax-Directed editing system of the given language on CSES.

Keywords: - Syntax-Directed edit, Generator, Grammar, Template, Phrase

                                   

1. Introduction

The program editing input is the most important link in software development, its quality will directly influence the smooth proceeding of later links such as compiling, linking, running link and so on, and also influences the whole software developing process. The computer experts and scholars attach importance to them as to how to improve program editing technology and quality. Many researchers have made great efforts in researching and exploring the problems. Syntax-directed technology is the important researching topic about the problems.

Research in syntax-directed technology has obtained some encouraging achievements up till now, such as the EMILY system [1], the MENTOR system [2], the PASES system [3], the CPS system [4], the GNOME system [5], the SUPPORT system [6], the AMY system [7], the GED system [8] and so on. Though the systems do not get widespread use like text editing systems, some of them such as CPS system and GNOME system have exerted an important function in the area of teaching applications, and have revealed a good future.

By analyzing and studying the systems which are mentioned above, we know that the majority of them are relevant to specific programming languages and edit the program of the specific programming language. There are several hundred programming languages up till now. If we study corresponding syntax-directed or structure editing systems for every programming language, we would make a great waste of manpower, material and financial resources, so the research of common syntax-directed system is now urgent. We have successfully designed and realized a new common syntax-directed editing system CSES, by researching thoroughly, exploring wisely and absorbing most new research achievements during the past two years.

CSES is a dynamic, interactive and common syntax-directed editing system. Syntax-directed editing, structure editing and text editing technology combine perfectly in CSES. It can dynamically generate and maintain the syntax-directed editing system of any high-level programming language or structured entity described by context-Free grammars and regular grammars. We use the generated syntax-directed editing system to carry out syntax-directed editing or text editing for a given language or entity. CSES can generate three types of syntax-directed editing systems: pure syntax-directed editing systems, pure text editing systems and synthetic editing systems that has the syntax-directed editing function and text editing function. The text editing is a special example in CSES. The applying scope of CSES is more widespread, and its use value is more greater than the text editing system and general syntax-directed editing system.

2. The theoretical basis of CSES

2.1. The Chomsky Language Hierarchy [9]

The Languages (including formal and informal languages) are divided into four classes: recursively enumerable language class, context-sensitive language class, Context-Free language class and regular language class. They are respectively called the type-0 language, type-1 language, type-2 language and type-3 language. They satisfy the following relations: type-0 languages [pic]type-1 languages[pic]type-2 languages[pic]type-3 languages. There are four grammars corresponding to the above language classes, respectively: unrestricted grammar, context- sensitive grammar, Context-Free grammar and regular grammar. They are respectively called the type-0 grammar, type-1 grammar, type-2 grammar and type-3 grammar. We use type-i grammar to describe type-i language. The language classes and grammars are called the Chomsky language hierarchy. Chomsky hierarchy theory is an important theoretical basis of CSES.

2.2.The Definition and Description of the high-level Programming Language

The High-level languages are formal languages. They also are type-2 or type-3 languages described respectively by type-2 or type-3 grammar. The rules of describing the language are called the syntax of the language. The syntax is divided into two classes: the lexical rule set and the syntax rule set. The lexical rule set defines the lexical symbols of the language, and the syntax rule set defines the syntax category of the language.

Suppose: L is a high-level language; G is a grammar rule set to describe the language L; W is a lexical symbol set of L and is a type-3 language describing the type-3 grammar; WG is a lexical rule set to describe W; S is a syntax category set of L and is a type-2 language described by a type-2 grammar; SG is a syntax rule set to describe S, then the following relations are set up:

a) L=S+W

b) G=SG∪WG

c) L=L(G), S=L(SG), W=L(WG)

d) L=L(G)=L(SG∪WG)=L(SG)+L(WG)=S+W

Symbol “∪” is a merging operator, symbol “+” is a composing operator. L=S+W shows that L is composed of S and W.

2.3. Analyzing, Recognizing and Generating the High-Level Language Program

A program can be viewed as a string of characters chosen from an alphabet of characters. But how do we decide that the strings of characters are valid programs of the high-level language L? The research results of language theory prove that the sentences in type-2 or type-3 language are decisive, so the valid programs of L can be decided.

The process of deciding if a string of characters is a valid program is called program analysis (also called sentence analyzing). There are two methods of analyzing the program: the first is a recognition method; the second is a generating method. The process of accepting a string of characters and recognizing whether it is a valid program or not, is called the program recognition. The system to complete program recognition is called program recognizer, such as syntax analyzer or parser. Now the methods and technologies of program recognition are now quite mature and perfect. Automatic recognizing methods and technologies about lexical symbols are recognized with finite automation. The corresponding automatic recognizing systems such as LEX system [10] and YACC system [11]have been developed successfully. We call them automatic recognizers. The process of directly generating valid programs is called program generation. The system of complete program generation is called program generator, such as pure syntax-directed system. Now the methods and technologies of program generation are not at the moment quite mature and perfect. We generate the lexical symbols and syntax categories by deriving technology of the parsing tree (Figure 1.).

Figure 1. The relation of the program to the

recognizer and generator

The pi is a string or program not to be recognized and the pi’ is a valid program.

The characteristics of the recognizer are:

a) Before recognizing. we need to create a string or a program to be recognized.

b) The recognition process is performed automatically. It does not need human intervention.

The characteristics of the generator are:

a) Before generating, we do not need to create a string or a program to be generated.

b) The generating process is completed by human intervention, when a certain syntax rule has many alternates.

We know the advantages and disadvantages of the recognizer and generator are helping each other. Which is the best analyzing method? This is to be considered by the concrete situation. To the syntax entities, such as identifiers and expressions, recognizing them with a recognizer is better than generating them with a generator. To the syntax entities, such as programs, blocks and statements, generating them with a generator is better than recognizing them with a recognizer. Generally, the synthetic analysis combining the recognizer with the generator is the best way.

3. The prototype design of CSES

3.1. Template and phrase

Suppose: L is a high-level language; G is the grammar of L;SG is a syntax rule set; WG is a lexical rule set, and G=SG∪WG. The rules of the grammar G are divided into two classes: one is used to recognize the syntax entities; the other is used to generate syntax entities.

a) Template: It is the rule used to generate the syntax entity in G, denoting t, and t∈G, then t∈SG or t∈WG.

b) Template set: It includes all templates in G, denoting TG, and TG=SG1∪WG1, SG1[pic]SG, WG1[pic]WG.

c) Phrase: It is the string to be recognized and to be typed from the keyboard in L.

d) Phrase rule: It is the rule used to recognize the phrases in G, denoting p, p∈G, then p∈SG or p∈WG.

e) Phrase rule set: It includes all phrase rules in G, denoting PG, and PG=SG2∪WG2, SG2[pic]SG, WG2[pic]WG.

f) The relations between the template and phrase rules: Suppose SG=SG1∪SG2; WG=WG1∪WG2, then TG∪PG=SG1∪WG1∪SG2∪WG2=SG1∪SG2∪WG1∪WG2=SG∪WG=G, L(G)=L(TG∪PG)=L(TG)+L(PG)=L.

The sentences or program parts in L(TG) are generated by the generator according to the rules in TG. The sentences or program parts in L(PG) are first created by the text editor, then recognized by the recognizer according to the rules in PG. Generally, we have WG1=Ф, WG2=WG, #SG1B)

THEN statement

ELSE write(B-A)

In which “A>B” and “write(B-A)” are the phrases. Because the text editor accepts these phrases, the cursor moves in the phrase scope on the whole screen. If the area of accepting the phrase is small, it can be enlarged by the enlarging command, then returned to the original editing area.

3.5. The Components and functions of CSES

The CSES consists of two parts: one part is related to the language to be edited, and the other is not (Figure 4.).

Figure 4. The components of CSES

The part naturally related to the language is a database related to the language. The information in the database consists of the data information of every language defined in CSES. Every language has three databases: the grammar database, the menu database and the status database. All template and phrase rules about the language are placed in the grammar database. These rules are the rules in TG and PG. All menu drawing information are placed in the menu database. The menu database has a corresponding relation in the grammar database. Every rule in the grammar has a corresponding menu information. If we create the corresponding relation for every rule in the grammar database in the menu database, we call that the menu database and grammar database of the language are balanced. The CSES demands this balance (Figure 5.).

Figure 5. The balance relationship of the menu

and grammar database

The database in CSES consists of the databases of all languages defined in CSES(Figure 6.).

Figure 6. The structure of the database

If we want CSES to be the syntax-directed editing system of a certain language, such as PASCAL, C, FORTRAN, and so on, we must create three subdatabases of the language in CSES database by relevant functions in CSES. If the relevant language categories or TG set and PG set change, we would alter and adjust the three subdatabases of the language with the relevant functions in CSES, so that the syntax-directed editing system about the language suits the needs of the change. If we do not use the syntax-directed editing system about a certain language, we may cancel the syntax-directed editing system by canceling the three subdatabases of the language by the relevant functions in CSES.

The parts not having relationship with the language are some common controlling or maintaining modules. They are divided into two classes: one is the modules used to maintain the information in the subdatabase; the other is the module used to edit or process the program(Figure 7.).

Figure 7. The composition structure of CSES

The functions of CSES has two types: one is the functions of maintaining the database; the other is the functions of editing or processing the program.

The maintenance of the database consists of the maintaining and managing subsystem of the grammar database, the maintaining and managing subsystem of the menu database and the automatic generator of the phrase analyzing recognizer.

The maintaining and managing subsystem of the grammar database has the following functions:

a) Initializing the grammar database.

b) Inserting the grammar rules in TG and PG.

c) Altering the grammar rules in TG and PG.

d) Looking into the grammar rules in TG and PG.

e) Outputting the grammar rules in TG and PG.

f) Deleting the grammar rules in TG and PG.

g) Examining the identity of the grammar database.

h) Examining the balance between the grammar database and menu database.

The grammar rules in the grammar database are the template rules in TG or the phrase rules in PG. These grammar rules are naturally the attributive grammar rules that possess some attributes such as format denotation and repeat denotation besides normal grammar information. The template rule abut WHILE statement is defined as the following:

While-statement::= s WHILE ( s condition) s DO

r BEGIN p statement r END

in which s is a sequential format attribute; r is a return attribute; p is a repeat attribute. After taking the definition, the displaying format of the WHILE statement on the screen is shown in the following:

WHILE (condition) DO

BEGIN

{Statements}

END

The maintaining and managing subsystem of the menu database has the following functions:

a) Initializing the menu database.

b) Inserting the menu information.

c) Altering the menu information.

d) Looking into the menu information.

e) Outputting the menu information.

f) Deleting the menu information.

g) Examining the identity of the menu information.

h) Examining the balance between the menu database and the grammar database.

The automatic generator of the phrase recognizer is to accept the phrase rules in the grammar database, and then to generate the analyzing statuses and put in the status database. If the phrase rules are type-2 grammar rules, the generated statuses are LR statuses. If the phrase rules are type-3 grammar rules, the generated statuses are DFA statuses.

The editing and processing functions:

a) Text Editing.

We designed a whole screen text editor for CSES. The editor can edit the phrases. It can complete the operations to insert, alter, delete and copy on the characters. The editing area can be enlarged and returned by the relate commands.

b) The Analyzing and Recognition of the Phrases.

We designed a phrase analyzing recognizer for CSES. The recognizer is used to carry out similar functions to the parser in the compiler. It can decide whether a phrase is valid. If the phrase has a syntax error, it would refuse to accept the phrase and request to alter the phrase. If the phrase has not syntax error, it would accept the phrase.

c) The insertion, deletion, moving, copying, hiding, returning and so on of the template.

We designed a template generator and an abstract syntax tree adjuster for CSES to complete the operations. The run of these operations are all to adopt the menu displayed.

d) Cursor movement.

We designed a module to decide the cursor position and a module used to transform the abstract syntax tree into the displaying text and a module used to display the text on the terminal screen. The cursor can only move between the syntax entities in the program.

e) Other functions.

Besides the functions mentioned above, CSES has other functions such as menu hiding and reappearing, the transformation between displaying text and abstract syntax, etc.

3.6. The Prototype Structure of CSES

The prototype structure of CSES is shown in Figure 8.

Figure 8. The prototype structure frame of CSES

4. The approaches and steps to set up a syntax-directed editing system about the given language on CSES

Suppose: L is a high-level language such as PASCAL language.

a) Using BNF to describe grammar rules about L.

b) Dividing all grammar rules into TG set and PG set.

c) Adding the attribute for the rules in TG and PG sets to make them into the attributive grammar rules.

d) Creating every attributive grammar rule in the grammar database by the grammar managing subsystem.

e) Defining the related menu and creating them in the menu database by the menu managing subsystem.

f) Running the automatic generator, generating the analyzing statuses in accordance with PG set and putting them into the status database.

g) Readjusting the information in the grammar, menu and status database with the grammar managing subsystem, menu managing subsystem or automatic generator, so that CSES adapts the change of editing feature.

5. Conclusion

CSES system is a new Syntax-Directed editing system. We can create the Syntax-Directed editor about any high-level programming language by CSES. CSES may has different characteristic by changing the TG and PG set of the language. CSES has all very well versatility and agility. It is used to the part of the incremental and integrated software development environments. We will add run-time debugging functions into CSES like Cornell program synthesizer, so that CSES will become a perfect developing environment.

References

[1] W. J. Hansen, “Creation of hierarchical text with a computer display”, Argonne National Laboratory Technical ANL, 1971, pp. 7818-7825.

[2] V. G. Donzeau, “A Structure-Oriented Program Editor: A First Step Towards Computer Assisted Programming”, Proc. International Computing Symposium, Antibes, 1975.

[3] E. Shapiro, G. Collins, L. Johnson, and J. Ruttenberg, “PASES: A Programming Environment for Pascal”, Schlumberger Programming Environment Workshop, 1980.

[4] Teitelbaum, Tim, and Reps, Thomas, “CPS-The Cornell Program Synthesizer: A syntax-directed Programming Environment”, CACM, 24(9), 1981, pp. 563-573.

[5] D. B. Garlan, and P. L. Miller, “An Introductory Programming Environment Based on a Family of Structure Editor”, ACM SIGPALN Notices, 19(5), 1984.

[6] M. V. Zelkowitz, “A Small Contribution to Editing with a syntax-directed Editor”, ACM SIGPLAN Notices, 1984.

[7] P. T. Hadjikomninges, and S. L. Stepoway, “Amy: A syntax-Guided Editor for Pascal”, Proc. of the Nineteenth annual Hawaii international conference on system sciences, 1986, pp. 470-479.

[8] T. Hamada, and H. Miyauchi, “Automatic Generation of Syntax-Directed Editor”, Electron communication (Japan), 1986, 69(1), pp: 61-71.

[9] J. E. Hopcroft, and J. D. Ullman, “Introduction to automata Theory, languages and Computation”, Addision-Wesley Publishing Company, 1979.

[10] M. E. Lesk, “LEX: A Lexical Analyzer Generator”, Bell Lab., CSTRR, 1975.

[11] S. C. Johnson, “YACC: A Yet Another Compiler-Compiler”, Bell Lab., CSRR, 1974.

-----------------------

Recognizer

Generator

pi

pi’

pi’

(a)

(b)

Screen

Command window1

Command window2

Editing window

Menu

window

Text editor

Phrase analyzing recognizer

Status set

PG set

Automatic generator

Phrase

string

Phrase

string

Valid

phrase

CSES

The part

related to

language

The part

not related to language

g1

g2

gn

m1

m2

mn

grammar

database

menu

database

L1 L2 Ln

Grammar

database

Menu

database

Status

database

L1

L2

Grammar

database

Menu

database

Status

database

Ln

The database

in CSES

Language

catalogue

CSES system

Database

Maintaining the database editing and processing the program

L1

Ln

Maintaining

the database

Editing and processing

the program

Grammar

managing

subsystem

Menu

managing

subsystem

Automatic

generator

Grammar

Menu

Language

catalogue

Menu

database

Status

database

Grammar

database

Cursor localizer

Template

generator

Syntax tree

adjuster

Text editor

Text tree transducer

Abstract syntax tree

Text buffer

Tree text

transducer

Phrase

recognizer

Stack

Text displaying

Total control and command interpreter

Start

menu

database

grammar

database

m1

m2

mn

g1

g2

gn

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download