Design and analysis of FPGA based self-timed system with specific focus on Xilinx FPGAs

M.SRIRAMAN

Electronics and Communication department

Anna University

CEG, Anna University, Guindy, Chennai-25

INDIA

Abstract:- ASIC-based self-timed systems use custom cells that are not commercially available to the design community. Hence, asynchronous self-timed design has remained restricted to the few organizations around the world that can afford the cost and effort. To make the methodology widely available, it should be possible to build these systems using basic components available to the entire design community. This dissertation work focuses on building a self-timed system using components available in a basic Xilinx FPGA, which is among the most widely used FPGAs in the world. The flow is based on designing basic macro modules, each consistent in its own timing, and building bigger designs from such macro modules. At the end of the dissertation work, a set of guidelines and a design flow will be available to the design community for designing self-timed systems using FPGAs. An example design of an arbitration-based memory access module is given to illustrate the credibility and desirability of the methodology. The module is part of a complex design in a larger chip, where meeting the 125 MHz timing in a Xilinx Virtex FPGA proved very difficult. It also has performance bottlenecks in delivering the required throughput. The self-timed design is used to show that both the timing requirement and the performance requirement can be met comfortably in the same FPGA.

Keywords:- Self-timed systems, macro modules, delay insensitive, dual-rail signaling, bundled-data signaling.

Introduction

Present-day VLSI systems predominantly employ clock-based synchronous design owing to its simplicity and predictability. Although work on them started as early as the 1950s, asynchronous design methodologies have not gained much ground, for reasons such as the lack of proper methodologies, possible design hazards and the non-availability of synthesis tools. The undisputed popularity of the synchronous design methodology was another key reason the design community paid little attention to asynchronous design. Clock-based synchronous systems have so far been able to cater to the growing needs of the design community. But as the complexity, size and speed of systems increase, they bring with them major issues such as speed limited by the worst-case timing delay, global synchronization with low skew and power consumption due to an always-active clock, all of them centered on the clock signal.

This thesis work analyzes the gradual application of asynchronous design to certain parts of synchronous designs, which can significantly boost performance. The resulting systems are mixed synchronous-asynchronous systems. The asynchronous circuits dealt with in this work are also called self-timed systems, since they do not need any external signal to synchronize or time their operations. The self-timed design community faces several issues:

1. Implementing reliable Handshake protocols

2. Synchronization of the control and data paths, which can be delay based or Completion detection based

3. Interfacing with the external/synchronous world

4. Avoiding design hazards/glitches, which can propagate through the design

This work aims to evolve a set of guidelines that will make self-timed systems realizable using commercially available FPGAs.

Scope of Dissertation Work

The dissertation work will restrict itself to:

1. Evolving a design methodology for self-timed systems, which will comprise the flow, the usage of the standard macro modules detailed in this work, and a list of dos and don’ts

2. An example design of a memory access arbiter, which will be implemented and verified using simulation

Objectives

1. Expanding the reach of self-timed systems by making their design easily accessible to the entire design community

2. Coming up with a flow/guidelines for the FPGA based self-timed systems so that the success is repeatable

3. Demonstrating the advantage of synchronous-self-timed design with an example

Background of the Dissertation work

The idea of asynchronous design has been around since the birth of VLSI design. The design community had to choose between asynchronous and synchronous design in the early 1950s. Owing to its simplicity and ease of design, synchronous design won out. Still, work on asynchronous, clockless design has continued in various universities and at some VLSI design powerhouses such as Sun Microsystems, Philips Semiconductor and Intel. In fact, companies like Philips have commercial products that do not use any clock.

The work in asynchronous design dates back to the mid-1950s, with Huffman’s theoretical work on asynchronous circuits. A major breakthrough came from Muller, who proposed a class of circuits closer to modern asynchronous circuits. He introduced a request-ack based 4-way protocol, which aids the control/data flow through the design. In 1969, Stephen Unger published his classic textbook on asynchronous circuit design, which presented a comprehensive look at asynchronous design methods. Since then many popular and practical asynchronous design projects have been executed, for example:

1. Use of speed-independent circuits in the design of the ILLIAC and ILLIAC-II computers by Muller et al. at the University of Illinois

2. Mainframe computers like MU-5 and Atlas constructed using fully asynchronous circuits

3. Design of the asynchronous AMULET processor in Manchester University

4. Also in the 1970s, asynchronous techniques were used at the University of Utah in the design of the first operational dataflow computer [102, 103] and at Evans and Sutherland in the design of the first commercial graphics system.

5. Design of asynchronous 80C51 microcontroller by Philips

6. The RAPPID project at Intel, which demonstrated that a fully asynchronous instruction length decoder for the x86 instruction set could achieve a threefold improvement in speed and power compared with a synchronous design

Asynchronous design is gaining popularity owing to advantages such as:

1. Low power

2. The circuit performance being dependent on average case delay rather than worst case delay as in synchronous designs

3. Easy local scalability and flexibility to change components. When a component is to be replaced, only that part needs to change, so long as its interface remains the same, without concern for the clock frequency or speed of the adjacent blocks

4. No need to route and balance low-skew clock trees

In spite of these advantages, the asynchronous design community faces many difficulties, which are a primary reason it has not gained the popularity over synchronous design that it deserves. Some of them are:

1. Non-availability of standard and clear-cut flow to all

2. Non-availability of commercial tools for verification and synthesis of asynchronous circuits

3. Usage of custom VLSI components for asynchronous ASIC design by the players in the industry makes it difficult for others to use them for wide manufacturing of commercial products

4. Robustness of the design is a major challenge to be overcome by the design community

5. Testability of the Asynchronous chips

Though this work cannot solve all of these issues, it intends to address the first and the third.

Issues/Challenges with Synchronous design

Synchronous design has enjoyed a monopoly in the VLSI design field since its birth. The key reasons behind this are:

1. Simplicity

2. Accessibility to the whole design community, with no need for special components that are proprietary to certain groups around the world

3. Robustness

4. Extensive tool support

5. Maturity gained over the years, with a well-defined flow and methodology

Synchronous design is a ‘ready to jump’ methodology with which anyone can start designing and taping out chips without much difficulty. The millions of chips taped out around the globe have yielded a well-defined pathway to success and given the design community immense confidence in making successful chips. But with increasing demands on speed and complexity, a severe bottleneck for the synchronous design methodology is imminent. The bottleneck comes from multiple angles. The next few subsections focus on the key issues and challenges limiting the applicability of the synchronous design methodology to future chips.

1 The killer clock

The speed of a circuit is directly related to how quickly the underlying components can switch. In synchronous designs the speed of the circuit is limited by the maximum frequency at which the clock can switch. The clock is the heartbeat of a synchronous circuit. To date, synchronous circuits have been able to scale well in speed thanks to advances in manufacturing processes. But even with advanced sub-micron processes, the single-point dependency on the clock brings a set of limitations that will be hard to overcome.

1. Clock skew issues

1. Designs with more than half a million flops have become common these days

2. With so many flops, the clock must be balanced so that the skew between the instants at which it reaches these flops is kept to less than tens of ps

3. This requirement is becoming more and more of a nightmare for backend teams because of loading and non-uniform distances

2. Clock switching and EM radiation

1. With increasing clock speeds the EM radiation is a serious issue to tackle.

2 Power factor

Since all the flops in the design are clocked all the time, the chip draws power continuously. Though clock gating methods have evolved, the majority of flops in the critical design path are not gated even though they may not be in use all the time. These flops account for a significant amount of power that could otherwise be saved.

3 Portability/Scalability

In a synchronous design, all components that interact with each other are tightly coupled in timing, on the assumption that they all operate on the same clock. Suppose one of the following requirements arises:

1. A particular block has to be scaled to higher frequency for performance improvement

1. This is not possible in isolation, since the other components interacting with this block also need to be scaled to the higher frequency. A solution would be to pipeline the design and increase the frequency. Performance will then not scale in proportion to the frequency; the gain is limited by the amount of pipelining.

2. The entire design needs to be ported to a different technology library

1. Here we cannot be sure that the design will close timing with the new technology library, particularly if the new library is inferior in timing, e.g. ASIC to FPGA conversion

4 Performance – Worst case!

In synchronous designs the highest achievable clock speed is limited by the largest combinatorial delay between two flops in the design. This means that the design performance is limited by the worst-case delay, that is, by the weakest link in the design.

Self-Timed systems

Asynchronous designs have existed since the invention of the transistor. Unfortunately they did not gain enough popularity, owing to their complexity and the absence of a big community to aid their progress. The asynchronous design methodology holds a lot of promise for overcoming the challenges that face synchronous designs. [9] gives a good comparison of present-day synchronous and asynchronous designs.

1 Clockless designs

Self-timed designs get rid of clocks entirely. Hence, there is no central timing signal that synchronizes all the events. This is a major relief from the nightmares of clock balancing and EM radiation. The entire design operates with a set of asynchronous handshaking protocols between the blocks.

2 Cool chips – Power saving

Due to the absence of a central clock, not all components are active all the time. Only those areas of the design that are actively processing at a given instant dissipate power. This can save significant power compared with synchronous designs.

3 Portability

Since all the components of the design are inter-connected using an asynchronous request-ack protocol, porting to a different technology to improve performance, or changing a particular part of the design for a performance improvement is naturally possible.

4 Performance – Average case!

The performance of an asynchronous design varies between the best and worst case delays depending on the input. In effect the design will operate at average case delay unlike synchronous designs that are always worst-case delay based.
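The contrast between worst-case and average-case operation can be illustrated with a small worked example. The sketch below is only an illustration: the per-operation delays are made-up numbers rather than measurements from this work, and the handshake overhead of the self-timed version is ignored.

```python
# Hypothetical per-operation combinational delays in ns (illustrative only).
delays_ns = [4, 6, 5, 12, 5, 4]

# Synchronous: the clock period must cover the slowest operation,
# so every operation is charged the worst-case delay.
clock_period_ns = max(delays_ns)
sync_total_ns = clock_period_ns * len(delays_ns)

# Self-timed: each operation signals completion as soon as it is done,
# so the total is the sum of the actual delays (handshake overhead ignored).
self_timed_total_ns = sum(delays_ns)
average_delay_ns = self_timed_total_ns / len(delays_ns)

print(f"synchronous: {sync_total_ns} ns, self-timed: {self_timed_total_ns} ns")
print(f"average delay per operation: {average_delay_ns} ns")
```

With these assumed delays the single slow operation sets the clock period for every operation in the synchronous case, while the self-timed total is governed by the average delay.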

Self-Timed designs – Issues/Limitations

The previous section listed the promises that asynchronous designs hold for overcoming the hurdles facing synchronous designs in the near future. From it, self-timed design may appear to be a clear winner over synchronous design and the favorite choice of the design community in future. But it is not as simple as it sounds. Self-timed designs face a set of serious challenges that will postpone their wide applicability by more than a decade. Unless these challenges are met with an appropriate, scalable solution, self-timed design will remain a distant dream for the VLSI design community. Some of the challenges are as follows:

1. Availability of design components:

1. Most present-day self-timed designs are based on custom VLSI circuits, which are accessible only to certain organizations and groups around the globe. An easy-to-follow methodology with simple design components should be available so that the design community can adopt it easily

2. The EDA companies for synthesis and backend do not show enough interest in self-timed systems

3. This work focuses on making self-timed systems independent of any EDA tool

2. Design Hazards

1. Since the designs are clockless, combinatorial loops are inevitable. Handling the combinatorial loops and other design hazards is essential. Some works rely on special synthesis tools to handle these issues. Techniques for achieving hazard-free asynchronous state machines are given in [12]. Also, [18] lists some techniques for robust asynchronous designs.

3. Memory

1. Networking designs are memory-intensive. There is no clear-cut memory-based self-timed design available in the literature.

2. This work gives a way to realize memories in self-timed systems, which will make its applicability to networking designs possible

4. Ease of design

1. Self-timed designs are definitely not as simple as synchronous designs. This is primarily because of the special care required in designing these circuits and the lack of exposure to these methodologies.

2. With a well-defined design/implementation approach the resistance will come down. This is one of the objectives of this work.

5. Completion detection

1. Since there is no central timing to mark the completion of operations, a different method must be adopted to detect that signals carry valid values before operations can start. Similarly, some method is required to mark the completion of an operation so as to kick-start the following operation. A comparison of various completion detection circuits is given in [10]. [13] gives a good account of speculative completion detection for adders

6. External interfacing

1. Since most chips are synchronous in nature, a fully asynchronous chip will be impractical in the near future. Attempting a fully asynchronous chip would slow down the progress and widespread applicability of self-timed designs.

7. Testability

1. This will be a new area to explore for ASIC self-timed designs. For FPGAs it does not matter, since FPGAs are pre-tested hardware components. Some work in this direction can be seen in [11] and [19].

This thesis work focuses on evolving a design methodology for self-timed designs in Xilinx FPGAs, making it widely accessible for the design community to try out without the need for any special custom VLSI circuits.

Past work on self-timed designs

There is quite a lot of work in the self-timed design area. But the majority of it focuses on using custom VLSI circuits, special algorithms, special synthesis tools or hand routing for delay control. [1] and [4] give a good account of the various available asynchronous design methodologies.

Broadly, the various works in this area can be categorized under the following heads:

1. Using custom VLSI circuits:

1. [10], [13] use custom VLSI circuits for asynchronous designs

2. [14] gives a custom FPGA to achieve self-timed system

2. Algorithmic

1. Petri net based designs/special synthesis tools

2. [15], [17], [12] use algorithmic level approach to asynchronous designs

3. [4] has a chapter dedicated to this

3. Globally Asynchronous and Locally Synchronous (GALS)

1. [20] analyses locally synchronous module design in a GALS system

4. Macro-module based design methodology

1. These works are closer to the present-day synchronous design methodology. They approach self-timed design by designing multiple macro-modules and putting them together to achieve a bigger design

2. This thesis draws heavily on [3]

3. [2] gives an FPGA example for this

Some of the issues not addressed in these works, which become the highlights of the current thesis work, are:

1. Synchronous-asynchronous approach for control path designs with immediate applicability in current day designs

2. Memory based design

3. Reducing the number of specially designed components so that the designer has to spend less time on creating these components

4. Increasing the ease of design so as to bring self-timed designs closer to current day synchronous designs

5. Reducing the hazard areas through designs and not by tools

The rest of the paper deals with self-timed circuits and their design.

Synchronous-asynchronous mixed design approach

The previous sections explained the issues facing synchronous design, followed by the promise of self-timed design to solve some of them. At the same time, self-timed systems have their own set of limitations that stand in the way of wide acceptance by the design community. A fully asynchronous chip is a distant reality given the current level of maturity of self-timed design methodologies. In this thesis work a mixed synchronous-asynchronous design is attempted. In a bigger synchronous design, certain parts that are performance critical and struggle to meet the required timing can be replaced with a self-timed equivalent. The self-timed circuit helps in meeting the required performance of the overall design. With the increasing complexity and speed of present-day designs, some requirements of synchronous design stand in the way of meeting the required performance:

1. Increased pipelining to meet the frequency requirement

1. To keep the operating frequency of the chip high, designs are heavily pipelined. These pipelines become a headache in latency-sensitive designs like arbiters. Pipelining also increases power consumption.

2. Increasing the frequency of the design to improve the performance requires reconsideration of the pipelining

2. Highly complex designs have many design blocks. To meet the inter-module timing predictably, the inputs and outputs require affinity flopping: all inputs are clocked once before being used in the module and all outputs are clocked once before being driven out of the module. This adds to latency

3. Memory interfaces should have one affinity flopping stage near the module and one near the memory. This brings the memory access latency to 5-6 cycles. Memory access arbiters suffer greatly because of this.

Every networking design has the above issues, which become a critical bottleneck for designers in meeting the required performance. Consider an example design of a memory access arbiter, which serves multiple clients accessing the memory. Assume that each client is of the read-modify-write type. Each client will then perform the following operations:

1. Read the memory

2. Modify the read data based on some conditions

3. Write back the modified contents

Suppose the clients are to be chosen on a priority basis. In this case there are a few catch points for the designer:

1. A client’s operation is limited by the speed at which the read can happen, since only then can it proceed to operate on the read data and write it back.

2. This read latency in turn depends on the memory access latency and the arbitration latency

3. If the number of clients is large, the arbiter becomes complex and requires appropriate pipelining to meet timing.
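The behaviour expected from such an arbiter can be sketched in software. This is a minimal behavioural model under stated assumptions, not the arbiter implemented in this work: the function names are hypothetical, client 0 is assumed to have the highest priority, and each client is assumed to own the memory word at its own index.

```python
from typing import Callable, List, Optional, Sequence


def priority_arbiter(requests: Sequence[bool]) -> Optional[int]:
    """Return the index of the highest-priority requesting client, or None.

    Assumption: a lower index means a higher priority.
    """
    for client, requesting in enumerate(requests):
        if requesting:
            return client
    return None


def read_modify_write(memory: List[int], address: int,
                      modify: Callable[[int], int]) -> None:
    """One client access: read the word, modify it, write it back."""
    data = memory[address]          # 1. read the memory
    memory[address] = modify(data)  # 2. modify, 3. write back


memory = [0, 0, 0, 0]
requests = [False, True, True]      # clients 1 and 2 are requesting
winner = priority_arbiter(requests)
if winner is not None:
    # Assumption: client i works on the word at address i.
    read_modify_write(memory, address=winner, modify=lambda d: d + 1)
```

In hardware every step of this loop costs arbitration or memory access cycles, which is why the latency of the read dominates the client's throughput.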

Fig 1 Synchronous memory arbiter

The figure above shows a synchronous memory arbiter. Here the arbitration latency can be 2-3 cycles and the memory access latency due to affinity flopping would be 5-6 cycles. Hence, the total access latency becomes 7-9 cycles. Each client suffers at least this much latency for memory accesses, which is a major limiting factor for the performance of the design. A self-timed system fits neatly into such a scenario and saves many access cycles.

Fig 2 Self-timed design in a synchronous system

In the above example the arbiter and memory have been replaced with self-timed equivalents. Now there is no pipelining, and the only delays are combinatorial and routing delays. This is surely much less than the 7-9 cycles of the synchronous design. The total power of the circuit will also reduce because of the self-timed modules. One disadvantage immediately visible in the example is the latency due to synchronization at the asynchronous interface, which will be 4 cycles. The self-timed modules are therefore advantageous if their total latency is less than 3-5 cycles, that is, the 7-9 cycles of the synchronous design minus the 4 cycles of synchronization. As the system grows more complex, the arbiter will have more latency and the self-timed system will be of sure advantage.
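The latency budget can be written out as a short calculation. The cycle counts are the ones quoted in the text; expressing the self-timed delay in equivalent clock cycles, and the helper name `self_timed_is_faster`, are assumptions made for this sketch.

```python
# Cycle counts quoted in the text, as (min, max) ranges.
arbitration_latency = (2, 3)
memory_access_latency = (5, 6)   # includes affinity flopping
sync_interface_latency = 4       # synchronization at the asynchronous interface

# Total latency of the fully synchronous arbiter plus memory.
sync_total = (arbitration_latency[0] + memory_access_latency[0],
              arbitration_latency[1] + memory_access_latency[1])

# Largest self-timed latency (in equivalent clock cycles) that still
# breaks even once the interface synchronization cost is added.
break_even = (sync_total[0] - sync_interface_latency,
              sync_total[1] - sync_interface_latency)


def self_timed_is_faster(self_timed_cycles: float, sync_cycles: int) -> bool:
    """True if the self-timed path plus synchronization beats the synchronous path."""
    return self_timed_cycles + sync_interface_latency < sync_cycles


print(sync_total, break_even)
```

For example, a self-timed path equivalent to 2 cycles beats even the best-case synchronous latency of 7 cycles, whereas one equivalent to 3 cycles only ties it.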

1 Delay insensitive control flow

Self-timed circuits apply a different type of structure to circuit design. Rather than let signals flow through the circuit whenever they are able as with an unstructured asynchronous circuit, or require that the entire system be synchronized to a single global timing signal as with clocked systems, self-timed circuits avoid clock-related timing problems by enforcing a simple communication protocol between circuit elements. This is quite different from traditional synchronous signaling conventions where signal events occur at specific times and may remain asserted for specific time intervals. In asynchronous systems it is important only that the correct sequence of signals be maintained. The timing of these signals is an issue of performance that can be handled separately. If this protocol is insensitive to delays through circuit components or the wires that are used to connect them it is known as a delay-insensitive protocol [1].

Self-timed protocols are often defined in terms of a pair of signals that request an action, and acknowledge that the requested action has been completed. One module, the sender, sends a request event (Req) to another module, the receiver. Once the receiver has completed the requested action, it sends an acknowledge event (Ack) back to the sender to complete the transaction. In order to maintain this sequence of events, the communicating modules must obey the following rules:

Rule 1 The sender must not produce a new request event until the previous request event has been acknowledged.

Rule 2 The receiver must not produce an acknowledge event unless it has received a request event.

Rule 3 After initialization, the sender may produce a new request event.

Fig 3 Delay insensitive control flow

These rules define the operation of the modules which follows the common idea of passing a token of some sort back and forth between two participants. Imagine that a single token is owned by the sending module. To issue a request event it passes that token to the receiver. When the receiver is finished with its processing it produces an acknowledge event by passing that token back to the sender. The sequence of events in this communication transaction is an alternating sequence of request and ack events. This sequence of events in a communication transaction is called the protocol. In this case the protocol is simply for request and ack to alternate, although in general a protocol may be much more complicated and involve many interface signals. Proper operation of a protocol requires the cooperation of more than one module; a single module cannot enforce all the rules for orderly communication. Notice that in the rules for this simple protocol Rule 1 must be observed by the sender, and Rule 2 observed by the receiver. If either module breaks the rules, the protocol is also broken.
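The token-passing view of the rules can be captured in a few lines of code. The following is a minimal sketch that checks a recorded trace of events against Rules 1 to 3; the event names "req" and "ack" and the function name are illustrative choices, not part of the original design.

```python
def follows_protocol(events):
    """Check that a trace of "req"/"ack" events obeys Rules 1-3.

    The single token starts with the sender (Rule 3). A request hands the
    token to the receiver; an acknowledge hands it back.
    """
    token_with_sender = True
    for event in events:
        if event == "req":
            if not token_with_sender:
                return False        # Rule 1: previous request not yet acknowledged
            token_with_sender = False
        elif event == "ack":
            if token_with_sender:
                return False        # Rule 2: acknowledge without a request
            token_with_sender = True
        else:
            raise ValueError(f"unknown event: {event!r}")
    return True


print(follows_protocol(["req", "ack", "req", "ack"]))  # alternating trace
print(follows_protocol(["req", "req"]))                # sender breaks Rule 1
print(follows_protocol(["ack"]))                       # receiver breaks Rule 2
```

A trace is accepted only when request and acknowledge events alternate, which is exactly the sequence the two modules must cooperate to maintain.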

2 Request-ack protocol and timing

Two variations of the protocol follow the rules for the delay-insensitive protocol defined in the previous section:

1. Two-way handshake

2. Four-way handshake

1 Two-way handshake

Every transition (high to low and low to high) of the request and acknowledge signals has a meaning. This is called transition signalling. As shown in Fig 4, a transaction is triggered by the low-to-high edge of the request signal. This is acknowledged by a low-to-high transition on acknowledge, which completes the transaction. The second transaction starts with a high-to-low transition on request, which is acknowledged by a high-to-low transition on ack.
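A behavioural sketch of the two-way handshake is given below. It models only the signal levels of one request/acknowledge wire pair, with no delays; the class and method names are hypothetical and do not correspond to any macro module of this work.

```python
class TwoPhaseChannel:
    """Two-way (transition-signalling) handshake on one req/ack wire pair."""

    def __init__(self):
        self.req = 0
        self.ack = 0
        self.trace = []  # (req, ack) levels after every transition

    def idle(self):
        # With transition signalling the channel is idle when the levels match.
        return self.req == self.ack

    def send_request(self):
        if not self.idle():
            raise RuntimeError("previous request not yet acknowledged")
        self.req ^= 1  # any edge on req, rising or falling, is a request
        self.trace.append((self.req, self.ack))

    def send_acknowledge(self):
        if self.idle():
            raise RuntimeError("no request pending")
        self.ack ^= 1  # any edge on ack is an acknowledge
        self.trace.append((self.req, self.ack))


channel = TwoPhaseChannel()
for _ in range(3):          # three complete transactions
    channel.send_request()
    channel.send_acknowledge()
print(channel.trace)
```

Each transaction costs exactly two signal transitions, one on each wire, and the wires do not have to return to zero between transactions.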

[Timing diagram for Fig 4: Request signal across Transaction 1, Transaction 2 and Transaction 3]

| |1 |2 |3 |4 |
|Access Latency at 100 MHz |150 ns |120 ns |120 ns |120 ns |
|LUTs |96 |968 |780 |238 |
|Flip-flops |188 |496 |444 |444 |
|Logic Power (mW) |14 |13 |13 |13 |

Table 1 Comparative figures

Figure 31 Comparative Charts

|Unit |LUTs |Flip-flops |
|SIB |116 |149 |
|ARB |57 |24 |
|ACKGEN |38 |4 |
|MUX |576 |268 |
|MEM |188 |52 |

Table 2 Unit-wise Logic breakup

The above tables give comparative figures for the fully synchronous and the mixed synchronous-asynchronous designs of the memory arbiter. It is evident from Table 2 that the MUX module is the major contributor to the logic, consuming more than 50% of the total. The main reason behind this is the dual-rail protocol. An alternative encoding scheme would surely reduce the area considerably. There is also not much power saving. As of now, the only advantage is the latency improvement. Hence, the approach of combining self-timed and synchronous logic is appropriate in those areas where the cost of performance is much higher than the cost of area. There surely are designs with this requirement, and the synchronous-asynchronous approach can be tried there. The approach nevertheless has to wait for some improvement in area before it will be widely accepted by the design community.
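The share of logic taken by the MUX module can be checked directly from the unit-wise figures of Table 2. The short calculation below uses only the numbers in that table; the variable names are illustrative.

```python
# (LUTs, flip-flops) per unit, copied from Table 2.
units = {
    "SIB": (116, 149),
    "ARB": (57, 24),
    "ACKGEN": (38, 4),
    "MUX": (576, 268),
    "MEM": (188, 52),
}

total_luts = sum(luts for luts, _ in units.values())
total_ffs = sum(ffs for _, ffs in units.values())

mux_lut_share = units["MUX"][0] / total_luts
mux_ff_share = units["MUX"][1] / total_ffs

print(f"total: {total_luts} LUTs, {total_ffs} flip-flops")
print(f"MUX: {mux_lut_share:.1%} of LUTs, {mux_ff_share:.1%} of flip-flops")
```

The units sum to 975 LUTs and 497 flip-flops, of which the MUX accounts for roughly 59% and 54% respectively, consistent with the statement that it consumes more than half of the logic.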

Future scope

From the summary, quite a few areas for improvement can be identified, which open the way to future work along these lines:

1. Reduction in area by coming up with a newer encoding scheme other than dual rail

2. An interpreter that can translate synchronous design to self-timed design

3. Direct self-timed memory and Muller-c flop realization in FPGAs and ASICs

Conclusion

It has been shown that it is possible to realize a successful mixed synchronous-asynchronous design that helps overcome performance bottlenecks in control path designs like arbiters. But with their huge logic cost, self-timed systems have a long way to go before they gain wide acceptance in the design community. As of now, the applicability of such mixed designs is restricted to areas where performance is more critical than the increase in logic. Even so, this thesis work has proven the feasibility of mixed synchronous-asynchronous design, which can itself act as a major encouragement for improvements such as area reduction and the realization of special asynchronous components in ASICs and FPGAs.

References

[1] Scott Hauck, Asynchronous Design Methodologies, Proceedings of the IEEE, Vol. 83, No. 1, pp. 69-93, January 1995
[2] Erik Brunvand, Using FPGAs to Implement Self-Timed Systems, Computer Science Dept., University of Utah, January 8, 1992
[3] Narinder Pal Singh, A Design Methodology for Self-Timed Systems, Massachusetts Institute of Technology
[4] Chris Myers, Asynchronous Circuit Design, John Wiley & Sons
[5] D. A. Edwards and W. B. Toms, The Status of Asynchronous Design in Industry, Information Society Technologies (IST) Programme Concerted Action Thematic Network Contract IST-1999-29119, 2nd Edition, Feb 2003
[6] Ivan E. Sutherland, Micropipelines, Communications of the ACM, Volume 32, Number 6, June 1989
[7] Al Davis and Steven M. Nowick, An Introduction to Asynchronous Circuit Design, September 19, 1997
[8] Erik Brunvand, Introduction to Asynchronous Circuits and Systems, University of Utah, USA
[9] C. H. (Kees) van Berkel, Mark B. Josephs and Steven M. Nowick, Scanning the Technology: Applications of Asynchronous Circuits
[10] Fu-Chiung Cheng, Practical Design and Performance Evaluation of Completion Detection Circuits, Department of Computer Science, Columbia University
[11] David A. Rennels and Hyeongil Kim, Concurrent Error Detection in Self-Timed VLSI, Computer Science Department, USC, LA. Uses custom VLSI components; gives a good account of the testability of self-timed designs.
[12] Robert M. Fuhrer (Dept. of Computer Science, Columbia University, New York, NY 10027), Bill Lin (IMEC Laboratory, Kapeldreef 75, B-3001 Leuven, Belgium) and Steven M. Nowick (Dept. of Computer Science, Columbia University, New York, NY 10027), Symbolic Hazard-Free Minimization and Encoding of Asynchronous Finite State Machines
[13] Speculative Completion for the Design of High-Performance Asynchronous Dynamic Adders, IEEE Async97
[14] Scott Hauck, Steven Burns, Gaetano Borriello and Carl Ebeling, An FPGA for Implementing Asynchronous Circuits, Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195
[15] Christian D. Nielsen and Alain J. Martin, A Delay Insensitive Multiply-Accumulate Unit, Computer Science Department, Cal Tech, Feb 12, 1992
[16] Karl M. Fant and Scott A. Brandt, Null Convention Logic, Theseus Logic, Inc.
[17] Tak Kwan Lee, A General Approach to Performance Optimization of Asynchronous Circuits, Thesis, Cal Tech, May 1995
[18] Alain J. Martin, Robustness in Asynchronous VLSI Techniques, Department of Computer Science, Cal Tech, June 1994
[19] Henrik Hulgaard, Steven M. Burns and Gaetano Borriello, Testing the Asynchronous Circuits: A Survey, Dept. of Computer Science and Engineering, March 1994
[20] Kip C. Killpack, Analysis and Characterization of a Locally Clocked Module, Thesis, University of Utah, May 2002
[21] Virtex-II Field Programmable Gate Arrays, Advance Product Specification, V1.9, September 26, 2002
