The ARM Processor



ARM PROCESSORS AND THEIR APPLICATIONS

ARM PROCESSOR: -

An ARM processor is any of several 32-bit RISC (reduced instruction set computer) microprocessors developed by Advanced RISC Machines, Ltd. The ARM architecture was originally conceived by Acorn Computers Ltd. in the 1980s. Since then, it has evolved into a family of microprocessors extensively used in consumer electronic devices such as mobile phones, multimedia players, pocket calculators and PDAs (personal digital assistants).

ARM processor features include:

• Load/store architecture

• An orthogonal instruction set

• Mostly single-cycle execution

• A 16 × 32-bit register file

• Enhanced power-saving design.

ARM provides developers with intellectual property (IP) solutions in the form of processors, physical IP, cache and SoC designs, application-specific standard products (ASSPs), related software and development tools: everything needed to create a product design based on industry-standard components that are 'next generation' compatible.

RISC MACHINES: -

RISC is an acronym for "Reduced Instruction Set Computer", in contrast to a CISC machine (Complex Instruction Set Computer). RISC designs claim the following simplifications over CISC:

1. Fixed 32-bit instruction size instead of variable-length instructions.

2. Large bank of 32-bit general-purpose registers (GPRs).

3. Easier to prototype and put together.

RISC Organization:

1. Hard-wired instruction decode logic instead of microcoded ROMs

2. Pipelined execution

3. Possible single cycle execution

RISC Advantages:

1. Smaller die sizes

2. Shorter time to develop

3. Possible higher performance than CISC

4. High clock rate with single-cycle execution

RISC Disadvantages:

1. Generally lower code density than CISC

2. Cannot execute x86 code, at least not without some sort of conversion and performance drawback

PROCESSOR OVERVIEW :-

ARM offers a wide range of processor cores based on a common architecture that deliver high performance together with low power consumption and system cost.

The ARM processor range provides solutions for:

• Open platforms running complex operating systems for wireless, consumer and imaging applications.

• Embedded real-time systems for mass storage, automotive, industrial and networking applications.

• Secure applications including smart cards and SIMs.

The ARM processor is a powerful, low-cost, efficient RISC processor with low power consumption. Its design was originally for the Archimedes desktop computer but, somewhat ironically, numerous factors about its design make it unsuitable for use in a desktop machine (for example, the MMU and cache are the wrong way around). However, many factors about its design make it an exceptional choice for embedded applications.

PROCESSOR FAMILIES:-

The following product families make up the ARM processor range:

• Cortex Processor Family 

• ARM7 processor family

• ARM9 processor family

• ARM9E processor family

• ARM10E processor family

• ARM11 processor family

• SecurCore processor family

Further implementations of the ARM architecture are available from ARM's partners, such as the Intel® XScale™ microarchitecture.

ARM ARCHITECTURE :-

• v1 - ARM1.

• v2 - ARM2.

• v2as - ARM3 & ARM250.

• v3 - ARM6, ARM7, ARM8 & Amulet 1.

• v3M - various ARM6, 7 & 8 variants.

• v4 - StrongARM, ARM9.

• v5 - ARM10.

• VFP1 - ARM10 (some variants of).

• Thumb (T variants).

• Long multiply instructions (M variants).

• Enhanced DSP instructions (E variants).

The ARM is a 32-bit machine with a register-to-register, three-operand instruction set. All operands are 32 bits wide. The ARM has 16 user-accessible general-purpose registers called r0 to r15 and a current program status register, CPSR. Register r15 contains the program counter, and register r14 is used to save subroutine return addresses (r14 is also called the link register, lr).

The ARM has more than one program status register, as Figure 1 demonstrates. In normal operation the current program status register (CPSR) contains the current values of the condition code bits (N, Z, C, and V) and 8 system status bits. When an interrupt occurs, the ARM saves the pre-exception value of the CPSR in a saved program status register, SPSR (there is one for each of the ARM's five exception modes). The ARM runs in its user mode except when it switches to one of its other five operating modes. Interrupts and exceptions switch in new r13 and r14 registers (the so-called fast interrupt switches in new r8 to r12 registers as well as r13 and r14). On any other mode switch, registers r0 to r12 are unmodified.
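The status-register description above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive model: the text names the flags but not their positions, so the bit positions and mode encodings used here are taken from the standard 32-bit ARM CPSR layout, and the function name `decode_cpsr` is hypothetical.

```python
# Bit positions follow the standard 32-bit ARM CPSR layout (an assumption
# relative to the text, which names the flags but not where they sit).
FLAG_BITS = {"N": 31, "Z": 30, "C": 29, "V": 28}

# Mode field (low five bits) for the user mode and the five exception modes.
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
}


def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into condition flags, interrupt masks and mode."""
    flags = {name: bool((cpsr >> bit) & 1) for name, bit in FLAG_BITS.items()}
    return {
        "flags": flags,
        "irq_disabled": bool((cpsr >> 7) & 1),
        "fiq_disabled": bool((cpsr >> 6) & 1),
        "mode": MODES.get(cpsr & 0x1F, "unknown"),
    }


# Example value: Z and C set, IRQs masked, Supervisor mode.
example = decode_cpsr(0x60000093)
print(example)
```

On an exception, the processor copies a value like this into the SPSR of the new mode, so the handler can restore it on return.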

Figure 1: The ARM's register set

Summary of the ARM's Register Set

• The ARM has sixteen 32-bit registers, r0 to r15, accessible at any one time.

• r15 acts as the program counter, and r14 (called the link register) stores the subroutine return address.

• In ARM assembly language you can write pc for r15, lr for r14, and sp for r13.

• The ARM has a current program status register, CPSR, that holds the condition codes.

• Some registers are not unique, because processor exceptions switch in new instances of r13 and r14.

• Because a subroutine call does not necessarily save the return address on the stack, the ARM is very fast at implementing subroutine calls and returns.

PROCESSOR TYPES:-

ARM 1:

• Very first ARM processor.

• It was first manufactured in April 1985, and was the very first commercial RISC processor.

• It was "working silicon" in its first incarnation, it exceeded its design goals, and it used fewer than 25,000 transistors.

The ARM 1 was used in a few evaluation systems on the BBC Micro (Brazil - BBC interfaced ARM), and a PC machine (Springboard - PC interfaced ARM).

It is believed a large proportion of Arthur was developed on the Brazil hardware.

It is very similar to an ARM 2, the differences being that:

• R8 and R9 are not banked in FIQ mode,

• there is no multiply instruction,

• there are no LDR/STR instructions with register-specified shifts,

• and there is no co-processor interface and there are no co-processor instructions.

ARM 2:

Additions such as the MUL and MLA instructions allowed for real-time digital signal processing. The ARM2 chip features 27 registers, of which 16 are accessible at any one time. Four processor modes are available:

USR: user mode.

IRQ: interrupt mode (with private copies of R13 and R14).

FIQ: fast interrupt mode (with private copies of R8 to R14).

SVC: supervisor mode (with private copies of R13 and R14).
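The figure of 27 registers follows directly from the private copies listed for each mode. The sketch below, a small illustrative model rather than a description of the hardware, counts them and shows which physical register a given name refers to in each mode. The `r13_irq`-style names and both function names are hypothetical labels chosen for this example.

```python
# Private (banked) registers per mode, as listed in the text for the ARM2.
ARM2_BANKED = {
    "USR": [],
    "IRQ": ["r13", "r14"],
    "FIQ": ["r8", "r9", "r10", "r11", "r12", "r13", "r14"],
    "SVC": ["r13", "r14"],
}


def physical_register(mode, reg, banked=ARM2_BANKED):
    """Return an illustrative name for the physical register `reg` maps to in `mode`."""
    if reg in banked[mode]:
        return f"{reg}_{mode.lower()}"
    return reg


def total_registers(banked=ARM2_BANKED, visible=16):
    """Sixteen visible registers plus one extra physical copy per banked register."""
    return visible + sum(len(regs) for regs in banked.values())


print(total_registers())  # 16 + 2 + 7 + 2
print(physical_register("FIQ", "r8"), physical_register("IRQ", "r8"))
```

Because FIQ mode has its own r8 to r12, a fast interrupt handler can keep working values in those registers between interrupts without saving and restoring them.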

Only code running outside USR mode may change the processor mode, which provides hardware security as long as the hardware and physical memory are accessible only from privileged code. Because the top six bits of the program counter register hold the processor status flags, this chip is restricted to 26-bit addressing, or a 64 megabyte address space. In fact eight bits of processor status are held in the PC register: because an ARM instruction is always four bytes long, the bottom two bits of the PC are always an implied zero when the register is used as a program counter, and when the register is used for other operations those two bits reflect the mode the processor is operating in (00 - USR, 01 - FIQ, 10 - IRQ and 11 - SVC).
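The packing of status and address into one register can be illustrated with a short decoding sketch. It assumes the 26-bit ARM register layout (N, Z, C, V in bits 31 to 28, the IRQ and FIQ disable bits in 27 and 26, the word-aligned address in bits 25 to 2, and the mode in bits 1 to 0, where 01 is FIQ and 10 is IRQ); the function name `decode_r15` is hypothetical.

```python
# Mode field in the bottom two bits of R15 (26-bit ARM architecture encoding).
MODES_26BIT = {0b00: "USR", 0b01: "FIQ", 0b10: "IRQ", 0b11: "SVC"}

PC_MASK = 0x03FFFFFC           # bits 25..2: word-aligned program counter
ADDRESS_SPACE_BYTES = 1 << 26  # 26 address bits give 64 megabytes


def decode_r15(r15):
    """Split a 26-bit-mode R15 value into status flags, program counter and mode."""
    return {
        "N": bool(r15 & (1 << 31)),
        "Z": bool(r15 & (1 << 30)),
        "C": bool(r15 & (1 << 29)),
        "V": bool(r15 & (1 << 28)),
        "irq_disabled": bool(r15 & (1 << 27)),
        "fiq_disabled": bool(r15 & (1 << 26)),
        "pc": r15 & PC_MASK,
        "mode": MODES_26BIT[r15 & 0b11],
    }


# Example value: N set, both interrupts masked, PC = 0x8000, supervisor mode.
state = decode_r15(0x8C008003)
print(hex(state["pc"]), state["mode"], ADDRESS_SPACE_BYTES // (1024 * 1024), "MB")
```

Six flag bits plus two mode bits account for the eight bits of status, leaving 24 bits of word address, which is 26 bits of byte address.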

A three-stage instruction pipeline allows the chip to execute instructions quickly with a fairly low transistor count. One side effect of the pipeline is the ability to get a 'free' rotation or shift on every instruction, as one stage of the pipeline deals exclusively with a barrel shift of a given register. Combined with the conditional execution of every instruction, this allows long runs of code without branches (which stall the pipeline), giving a fairly high instruction execution speed for the clock rate: about 0.6 instructions per clock cycle on average.
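To make the 'free' shift concrete, the sketch below models the four barrel-shifter operations on 32-bit values and an ADD whose second operand is shifted first, in the spirit of an instruction such as ADD r0, r1, r2, LSL #2. It is a simplified model under stated assumptions: shift amounts are taken to be in the range 0 to 31, carry-out is ignored, and all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left, truncated to 32 bits."""
    return (value << amount) & MASK32


def lsr(value, amount):
    """Logical shift right: zeros enter from the top."""
    return (value & MASK32) >> amount


def asr(value, amount):
    """Arithmetic shift right: the sign bit is replicated."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32  # reinterpret as a negative two's-complement number
    return (value >> amount) & MASK32


def ror(value, amount):
    """Rotate right within 32 bits."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def add_shifted(rn, rm, shift, amount):
    """Model of ADD rd, rn, rm, <shift> #amount: shift rm first, then add."""
    return (rn + shift(rm, amount)) & MASK32


# Index into an array of 4-byte words: base + (index << 2) in one instruction.
print(hex(add_shifted(0x1000, 5, lsl, 2)))
```

On a processor without this feature, the shift and the add would be two separate instructions.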

The ARM2 chip was clocked at 8 MHz giving an average performance of 4-4.7 MIPS.

ARM 3:

This is an ARM2 core macrocell with a cache and a dedicated coprocessor interface added. The register set was unchanged and no new processor modes were added. New in the ARM3 chip were an on-chip cache (4 KB, 64-way set associative, random replacement, 4-word lines, write-through, mixed data and instructions) and much faster clock speeds. Also new were adjustments to the co-processor interface, including the definition of co-processor fifteen as cache control and chip identification.
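The cache parameters quoted above fix its geometry, which can be worked out as follows. The arithmetic comes straight from the stated sizes; the address split in `split_address` is the conventional textbook arrangement (line offset in the lowest bits, set index just above it) and is an assumption here, since the text does not say how the ARM3 hardware selects a set.

```python
CACHE_BYTES = 4 * 1024
WAYS = 64
WORDS_PER_LINE = 4
BYTES_PER_WORD = 4

LINE_BYTES = WORDS_PER_LINE * BYTES_PER_WORD  # 16 bytes per line
NUM_LINES = CACHE_BYTES // LINE_BYTES         # 256 lines in total
NUM_SETS = NUM_LINES // WAYS                  # 4 sets of 64 lines each
OFFSET_BITS = LINE_BYTES.bit_length() - 1     # 4 bits select a byte in a line
INDEX_BITS = NUM_SETS.bit_length() - 1        # 2 bits select a set


def split_address(addr):
    """Split an address into (tag, set index, byte offset), conventional layout."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


print(NUM_LINES, NUM_SETS, split_address(0x1234))
```

With only four sets and 64 ways per set, the cache behaves almost like a fully associative one, which is why a simple random replacement policy is workable.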

Finally, one new instruction was added: SWP, an atomic register-to-memory swap, useful for synchronisation in multi-processor arrays.
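A swap instruction is useful for multi-processor work because reading the old value and writing the new one happen as a single indivisible step, which is enough to build a lock. The following is a toy Python model of that idea, not ARM code: the `Memory` class, its `swp` method, the lock address and the worker function are all hypothetical, and a `threading.Lock` stands in for the indivisible bus cycle.

```python
import threading
import time


class Memory:
    """Toy word-addressed memory with an atomic swap, modelled on SWP."""

    def __init__(self):
        self._words = {}
        self._bus = threading.Lock()  # stands in for the indivisible bus cycle

    def swp(self, address, new_value):
        """Atomically store new_value at address and return the old contents."""
        with self._bus:
            old = self._words.get(address, 0)
            self._words[address] = new_value
            return old


LOCK_ADDR = 0x100  # hypothetical address of the lock word
FREE, TAKEN = 0, 1


def acquire(mem):
    # Swap TAKEN in; if the old value was already TAKEN, someone else holds it.
    while mem.swp(LOCK_ADDR, TAKEN) == TAKEN:
        time.sleep(0)  # yield and retry


def release(mem):
    mem.swp(LOCK_ADDR, FREE)


def worker(mem, counter, rounds):
    for _ in range(rounds):
        acquire(mem)
        counter[0] += 1  # critical section
        release(mem)


mem = Memory()
counter = [0]
threads = [threading.Thread(target=worker, args=(mem, counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter[0])
```

Only the thread whose swap returns FREE enters the critical section, so every increment is counted exactly once.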

Several speeds of ARM3 chips were produced. Initially 26 MHz varieties were released with the A540 machines, then 25 MHz versions were used in the A5000 and 24 MHz ones in the A4. Finally a 33 MHz version was produced and used in the alpha variant of the A5000.

A second incarnation of the chip was the ARM250, a 12 MHz variant of the ARM3 cell with the IOC1, VIDC1a and MEMC1 chips all integrated into the one chip; unlike the normal ARM3, it had no processor cache. The ARM250 delivered about 7 MIPS.

A 24 MHz ARM3 using a 12 MHz main memory will produce an average speed of execution of 13.26 MIPS. At 33 MHz, 17.96 MIPS is delivered.
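The MIPS figures quoted for the ARM2 and ARM3 relate to clock rate through the average number of instructions completed per clock cycle. A rough sketch of that arithmetic, using only numbers given in the text and two hypothetical helper functions:

```python
def mips_from_ipc(clock_mhz, ipc):
    """Million instructions per second = clock rate in MHz x instructions per cycle."""
    return clock_mhz * ipc


def instructions_per_cycle(mips, clock_mhz):
    """Average instructions completed per clock cycle."""
    return mips / clock_mhz


# ARM2: 8 MHz at about 0.6 instructions per cycle.
arm2_estimate = mips_from_ipc(8, 0.6)

# ARM3: quoted MIPS figures at 24 MHz and 33 MHz.
arm3_24 = instructions_per_cycle(13.26, 24)
arm3_33 = instructions_per_cycle(17.96, 33)

print(round(arm2_estimate, 2), round(arm3_24, 4), round(arm3_33, 4))
```

The ARM2 estimate comes out at about 4.8 MIPS, just above the quoted 4-4.7 MIPS range, which is consistent with 0.6 being an approximate average. The ARM3 figures work out to roughly 0.55 instructions per cycle at both clock speeds.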

ARM 4 & ARM 5:

These were never made. In the changeover from Acorn to ARM Ltd designing the processors, the numbering scheme for the chips was changed, and the numbers 4 and 5 were skipped.

ARM 6:

This processor cell is the first of the commercially available ARMs to have full 32-bit addressing capability. The processor now has 31 registers, along with six new processor modes:

• User32 - 32-bit USR mode.

• Supervisor32 - 32-bit SVC mode (private SPSR register).

• IRQ32 - 32-bit IRQ mode (private SPSR register).

• FIQ32 - 32-bit FIQ mode (private SPSR register).

• Abort32 - memory fetch abort mode (private SPSR register).

• Undefined32 - undefined instruction mode (private SPSR register).

The SPSR is a Saved Processor Status Register and holds a copy of the CPSR (Current Processor Status Register) taken when the new mode is entered. The addition of the Abort32 mode, together with the CPSR/SPSR arrangement (itself really a corollary of the change to 32 bits), allows the ARM6 cell to handle virtual memory easily, without the contortions required on earlier ARM chips.

Two new instructions for reading and writing the CPSR and SPSR registers were added. The program counter is now fully 32-bit, with the CPSR being hardware-shifted into position when the PC is read in 26-bit modes (for backwards compatibility). The ARM6 cell is fully binary compatible, in the 26-bit modes, with code for the earlier ARM cells. The chip is fully static: the clock can be slowed to any speed and the processor will maintain state. Finally, the cell can work in either big-endian or little-endian operation and can be hardware-switched between the two modes. The total transistor count of the ARM6 cell (not chip) is 36,000.
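The difference between the two byte orders is easiest to see by laying out one 32-bit word in memory both ways. This is a generic illustration in Python rather than anything specific to the ARM6; the function name `store_word` is hypothetical.

```python
def store_word(value, byteorder):
    """Return the four bytes of a 32-bit word in the order they sit in memory."""
    return value.to_bytes(4, byteorder)


word = 0x11223344
little = store_word(word, "little")  # least significant byte at the lowest address
big = store_word(word, "big")        # most significant byte at the lowest address

print(little.hex())  # 44332211
print(big.hex())     # 11223344

# Reading the bytes back with the matching byte order recovers the same word.
assert int.from_bytes(little, "little") == int.from_bytes(big, "big") == word
```

A processor that can switch between the two modes can share memory or peripherals with systems of either convention without software byte swapping.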

Several versions of the ARM6 cell have been produced. The ARM61 is a hardwired version of the ARM6 cell in ARM2/3 compatibility mode; this chip cannot enter the 32-bit address/processor modes. The ARM600 range of chips is an ARM6 cell with an inbuilt MMU, an on-chip cache similar to the ARM3 chip's, an eight-deep write-back buffer with two independent addresses, and a total transistor count of 360,000. The cache has had performance tweaks, is now controlled by the MMU and has been adjusted for 32-bit addressing. Three ARM610 chip speeds have been produced: 20 MHz delivering 17 MIPS, 30 MHz delivering 26 MIPS, and 33 MHz giving around 27-28 MIPS.

Also available are the ARM60 (an ARM6 cell as a chip, without anything else), the ARM650 (an ARM6 with some RAM and peripheral controllers, designed for embedded control systems), the ARM6l (a lower-power ARM6 cell) and the ARM60l (a lower-power version of the ARM6 cell as a chip).

ARM 7:

The ARM7 cell is functionally identical to the ARM6 cell in capabilities but may be clocked faster than the ARM6. A variant of the ARM7 cell offers an improved hardware multiply, suitable for DSP work.

Most of what is new in the ARM7 cell is internal changes to the timings of various signals. The ARM700 chip has a larger on-chip cache than the ARM600 (8 KB, and radically altered for power efficiency), improving cache hit rates. It also has twice the number of translation lookaside entries in the MMU and twice the number of addresses in the write buffer. (Presumably four addresses can now be written before the buffer stalls.) At 40 MHz the ARM710 delivers about 36 MIPS, or around a 40% improvement over the ARM610.

ARM7 series devices are the ARM7 (chip cell core), ARM7D (the chip core with debugging support), ARM7DM (an ARM7D with an enhanced multiply), ARM7DMI (an ARM7DM with ICEbreaker(tm), which is on-chip support for In-Circuit Emulation), ARM70DM (ARM7DMI as a chip), ARM700 (ARM7 + MMU + cache + write-back buffer) and the ARM7500 (ARM7 + MMU + cache + write-back buffer + IOMD + VIDC20). Nearly all of these cores can be offered with the Thumb core as well.

ARM 8:

The ARM8 cell is directly compatible with the ARM6 and ARM7 devices. However, it includes a five-stage pipeline (an idea duplicated in the StrongARM device), a speculative instruction fetcher and internal tweaks to the processor to allow a higher clock speed. The cache remains the same size but becomes a write-back cache, and a 64-bit multiply instruction is added.

Fabricated on a 0.5 micron process, the chip is listed as delivering 80 MIPS as a 3.3 volt device at 80 MHz. This is over twice the performance of an ARM7 chip and lives up to the initial 'roadmap' promises made about the ARM family. However, its performance is eclipsed by the StrongARM devices for raw processing power.

STRONGARM:

This is the high-speed variant of the ARM chip family, developed by ARM Ltd in conjunction with Digital. Architecturally it is similar to the ARM8 core, sharing the five-stage pipeline with that processor. One difference is the change from a unified data and instruction cache to a split (Harvard architecture) instruction and data cache. Each cache is 16 KB in the SA110.

In terms of the instruction set, one new instruction is added: the halfword load/store, for moving 16-bit data units. Complete code compatibility with earlier processors is not guaranteed, for two reasons. First, the extended pipeline means that stack operations that store the program counter will store a PC value a full sixteen bytes ahead of the currently executing instruction, rather than the more usual eight bytes. Second, the split cache introduces problems with self-modifying code: if code is first executed, then treated as data and manipulated, and an attempt is made to execute the altered code before the old version is flushed from the instruction cache, such code fragments will break. Fortunately such code tends to be fairly rare and confined to the OS (SWI handlers in particular).

Produced on a 0.35 micron process, the SA110 part achieves 115 MIPS at 100 MHz, 185 MIPS at 160 MHz and 230 MIPS at 200 MHz. The SA1100 part is designed for portable applications and contains an SA core, MMU, read/write buffers (probably a Level 1 cache and write buffer akin to the SA110 part), PCMCIA support, a colour/greyscale LCD controller and a general-purpose IO controller (including two serial ports and USB support). It can be clocked at 133 or 200 MHz and consumes less than 500 mW of power.

ARM 9:

An incremental improvement over the ARM8, this chip features the same five-stage pipeline but is now a Harvard architecture chip, like the StrongARM. This probably means that the same restrictions on self-modifying code apply as for the StrongARM.

It is initially going to be offered as two parts: the ARM9TDMI (Thumb, Debug support, 64-bit Multiply and ICEbreaker In-Circuit Emulation), which is the base core part, and the ARM940T. The ARM940T offers, above and beyond the base core, 4 KB instruction and data caches, a write buffer (8 words, 4 independent addresses), an AMBA bus interface, external co-processor support and a protection unit for embedded applications (it requires no address translation and allows eight protected areas of memory, each with its own size and level of protection). Both parts are fabricated at 0.35 microns and clock at 150 MHz (producing 165 MIPS), with the ARM9TDMI consuming 225 mW and the ARM940T 675 mW.

ARM 10:

Designed to be fabricated on 0.25 and 0.18 micron processes, it is meant to run at 300 MHz, giving 400 MIPS while consuming less than 600 mW of power. A companion development to the core is a Vector Floating Point unit (VFP10), delivering 600 MFLOPS at 300 MHz and designed to be used by the ARM10. New features in the core include branch prediction, parallel instruction execution (though curiously not full superscalar execution; presumably the trick is that multiple executions of the same pipeline stage are now possible if the instructions are independent of each other) and some method of continuing instruction execution on cache misses (perhaps only for data cache misses, seeing as the new processor appears to be a Harvard architecture like the StrongARM processor).

Initially planned versions include the ARM10TDMI core and the ARM1020T processor, which is built around this core but adds an MMU with demand-paged virtual memory support, a 32 KB Harvard-style level 1 cache (most likely 16 KB instruction and 16 KB data caches, à la the StrongARM), a write buffer and an enhanced AMBA bus interface. Exact power consumption figures haven't been released, but I expect the ARM1020T will consume between 0.6 and 1 watt at 300 MHz.

ARM APPLICATIONS:-

1. ARM seems to be leading the way in embedded processing. The processor has found this to be one of its greatest niche markets, mainly because of the steps the company has taken to fit into the embedded market and the architecture it has adopted. DSP is prevalent in the embedded processors used in cell phones, cordless phones, base stations, pagers, modems, smartphones and PDAs (Personal Digital Assistants). Other embedded applications that take advantage of such processors are disc drive controllers, automotive engine control and management systems, digital auto surround sound, TV set-top boxes and internet appliances. Other products are still being modified to take advantage of it: toys, watches, etc. The possible applications are almost endless.

2. As the terminology has shifted around ultra-mobile PCs, mobile internet devices (MIDs) and netbooks, some may have made the assumption that any given PC would be powered by either an Intel processor or one designed by ARM Holdings plc (Cambridge, England) and implemented by one of its semiconductor licensees.

3. It turns out that assumption is too simplistic. Some PC makers are using an Intel processor to run the Windows operating system but finding ways to let an ARM processor run some key applications for the sake of its power efficiency and the resulting longer battery life for the computer.

4. Warren East, president and CEO of ARM, told analysts at a meeting held to discuss the company's financial results that his company's collaborative work with software vendors to port specific applications to ARM's processor cores was yielding results in both mobile internet devices and in PCs.

5. Much to the chagrin of the largest high-tech companies whose products have served as the foundation for computing for the past 30 years, the microprocessor is breaking free of the chains that bind it: overweight operating systems, the need for heavyweight batteries, and the requirement for lots of room with a built-in fan or two to keep devices cool enough to operate.

6. ARM is a microprocessor designer that is taking advantage of advancing technology's steady destruction of those chains forged by the likes of AMD, Intel, and Microsoft.

7. Here are a few facts:

• ARM processors are used in both the Apple iPhone and Amazon's Kindle.

• ARM shipped its 10 billionth processor this year and in 2008 alone shipped 2.5 billion processors.

• ARM chips are in about 98 percent of all cell phones out today.
