


A Policy for Government Support

of Computer Systems R & D:

A Look at 50 Federally Funded

Computer Systems Research Projects Over 30 Years

Gordon Bell

April 1994,

(with additions February 22, 1995)

450 Old Oak Court

Los Altos, CA 94022

415 949 2735 phone/fax

gbell@mojave.stanford.edu

A Policy for Government Support of Computer Systems R & D:

A Look at 50 Federally Funded Projects Over 30 Years

Gordon Bell

Abstract

The federal government has played a most significant role in computer system development, including minicomputers, workstations, RISC architecture, computer networks, and, over the last decade, parallel computers. It is important to understand the funding mechanisms that form or detract from healthy computer structures and lasting industries. These heuristics, based on my 30-year recollection of roughly 50 computer hardware systems activities, are offered to policy makers and funders:

1. Demand side works, i.e., "we need this product/technology in order to accomplish x"; supply side, based on a "Field of Dreams" premise of "build it and they will come," doesn't work.

2. Direct funding of university research resulting in technology and product prototypes that are carried over to start up a company is the most effective -- provided the right person & team are backed with an avenue for technology transfer.

a. One researcher, Forest Baskett, executive VP at SGI, should be encouraged to return to Stanford because he was very successful (SGI, SUN, MIPS).

b. Transfer of technology, other than trained people, to large companies for new or existing projects has not been effective. No really successful transfers are known.

c. Government labs rarely produce products or create companies unless by accident.

3. A demanding and tolerant customer or user who "buys" or demands products works best to influence and evolve products (e.g., CDC, Cray, DEC, IBM, SGI, SUN, TMC).

a. DOE's labs have been effective buyers and influencers as significant users, i.e., the "Fernbach policy" at Livermore and Los Alamos that created the supercomputing industry.

b. Universities influenced timesharing, graphics, workstations, AI workstations, etc. through purchase, co-development, use, and product evolution.

c. Although a major and successful funder in the past, (D)ARPA has been less successful in the last decade of parallel processing, partially because of the scale and difficulty of the problem and the lack of a driving need from its university computer science contractors -- who hence are unlikely to be helpful as users in the trek to the teraflop.

4. Direct funding of large scale projects to a sole source vendor is risky in outcome, training, and technology transfer. Did BBN or ARPAnet help or delay the establishment of computer networking and a network industry? What about the Internet?

5. HPCC-funded product development, targeted purchases, and other subsidies to establish "State Computer Companies" in a vibrant and overcrowded market are wasteful, likely to be wrong, and likely to impede computer development (e.g., by having to feed an overpopulated industry) that would otherwise, by its nature, be likely to go the right way. Too much funding created many experimental FORTRAN dialects that impeded ISV apps. It is also likely to have a deleterious effect on a healthy industry, e.g., supercomputing. On February 2, 1995 Fujitsu announced a computer using CMOS vector processors operating at 2 Gflops. The U.S. focus on multiple commodity micros that have comparatively poor vector speed shows how wrong an academic community can be.

6. University-company collaboration is a new area of government R&D. So far it hasn't worked, nor is it likely to unless the company co-invests and receives no subsidy. This form of funding appears to be a way to help a company keep and fund marginal people and projects. In many cases, even if a project were to succeed, the company has no avenue to a market, nor is it likely to fund costly market development.

7. CRADAs, or cooperative research and development agreements, are very closely allied to direct product development and are also likely to be ineffective.

8. Direct funding of software apps or the porting of apps to one platform, e.g., EMI analysis, is a way to subsidize a marginal company. If government funds apps porting, it must be cross-platform for comparative benchmarking, understanding, and training.

9. Too many marginal machine efforts are funded! Encourage the use of computers, but discourage new designs from those who have not used, needed, or built a computer.

Summary

A number of heuristics are given regarding effective funding of computer systems research and development. Given the nature of computer systems, projects are relatively large scale, costing several to tens of millions of dollars depending on whether the project builds on a design and existing computer system and infrastructure or is built from scratch in a university or a company.

Only two funding methods have been found to be effective: university research that is transferred via a startup company and the purchase of systems by knowledgeable, early adopter users to validate them and assist in their development and evolution.

Given the extensive ARPA funding to companies and the projected use of this type of funding by the Department of Commerce, it is critical to understand if the poor results presented herein will be a future predictor.

Introduction

Government agencies such as (D)ARPA, DOE, ONR, and NSF have a remarkable record of nurturing the development of new computer structures, including timesharing, graphics and image processing, supercomputing, ARPAnet and wide area net protocols, RISC processors, and parallel computing. AI, resulting in rule-based and expert systems programming and speech understanding, human interfaces, and UNIX as a standard also came from ARPA-funded efforts (Bell and McNamara[1], 1991). For a score of years beginning in the early 1960s, Seymour Cray invented supercomputers, and DOE's Livermore and Los Alamos Laboratories specified, bought, and applied them, creating the supercomputing industry.

In many of the successful cases of technology transfer, technology in the form of a product prototype was developed in a university. The technologist left the laboratory, got financing, and went on to develop a product. The university research, government laboratory, and leading edge industrial communities, acting as early adopters, bought, applied, and helped evolve the product. DEC formed in this model by building on technology from MIT's Lincoln Laboratory. Similarly, government-supported ERA begot CDC, which begot Cray Research and Cray Computer. SUN Microsystems came from a prototype and team from Stanford and Berkeley; and Cisco, MIPS, and SGI came from Stanford. With the availability of capital for competent teams, and the drive and skill for success, entrepreneurial startups will continue to be the main way to form lasting, successful companies. However, immortality is not guaranteed (e.g., CDC, Cray Computer, DEC).

Transfer of technology from a university to an existing company has been less successful. Cal Tech transferred the multicomputer architecture idea to Intel, with questionable success, as Cal Tech's fine grain multicomputer acquired overhead to become medium grain at Intel. Patterson transferred RISC architecture to the SUN Sparc, and Katz transferred RAID to SF2/MTI. RAID was also successful because the idea in the paper was self-evident and could be directly translated to hardware and software. The only reliable form of technology transfer is the transfer of people from a research project to either a startup or an established company.

Direct government funding of a company to develop a product, followed by funding product purchases, doesn't appear to produce healthy companies. "State Computer Companies" are most likely to impede the natural evolution of technology and product development. ARPAnet is probably the most successfully funded product development, but it can be argued that funding BBN to develop ARPAnet impeded the wide-scale adoption of packet switching. Furthermore, "state products" create an artificial market and non-level playing field that deny privately funded companies the early adopter market characterized by universities and government laboratories. Similarly, companies can use state subsidy, including list-price sales, to underwrite commercial market development, including price-cutting.

Heuristics from government funded computer systems research

1. Demand side funding or any relationship that creates a "buyer"-"seller" relationship is essential for every project. "Best effort" programs, or programs created to research knowledge or to create tools, products, or processes, are certain to be disastrous when doing computer systems R&D. Developing computer systems is unlike open-ended research and must be judged as development instead of research.

All of us who've been involved in tools development understand this heuristic: tools built in isolation by a "tools" group are rarely used or often break; the most useful tools are by-products of an aggressive project that requires the tool!

2. "Direct funding of university" or laboratory research for technology and prototypes of products that are carried over to start up a company or as a product in an existing company is the most successful model for the development of technology and products and for technology transfer. Tables 1 and 2 give companies that have come from university research and companies that have relied on or required a government-supported marketplace, respectively. Appendix A describes each sample, and A.1 includes a snapshot of the outcome of 20 Strategic Computing Initiative research efforts that were funded in 1985.

Funding of technology development that allows a company to start up and create new products is effective (e.g., Evans and Sutherland's graphics processors, and Kuck Associates' parallelizing compilers). The many start ups that came from Stanford show the efficacy of this form of transfer.

The flow of technology to existing companies from university research is difficult and rare (one example). The flow of Cal Tech's fine grain multicomputers to Intel is a significant transfer to an existing company, even though many question the efficacy of the resulting product compared with Cal Tech's original. CMU's Warp failed to become a successful product at GE, Honeywell, or Intel, partially because these companies had no way to deliver and market these specialized products. The CMU Warp team and its successors were the founders of Fore Systems, the earliest startup company to introduce an ATM switch.

Although funding university research has produced enormous benefit, this approach is extremely wasteful and inefficient. Nearly every result has come from a highly focused, directed, large-scale research effort led by one or two individuals and a cadre of "attracted" faculty and students. Funding Berkeley, CMU, MIT, and Stanford as the "DARPA Universities" just gives stability, a computing environment, and a pool of students and faculty. Only ARPA has the funds to commit to "block support" of departments, long-term 20-year continuous technology research (such as speech, robotics, and vision), or the large projects that generate the breakthroughs. A more thorough study might reveal interesting patterns and include the software spinoffs and training of people that are not computer systems per se.

3. Computer systems development has progressed rapidly using a "demand-side" approach -- a demanding and tolerant customer or user "buys" high tech products. Furthermore, the efforts of early users who co-developed, used, and evolved products have been critical for helping create "industrial strength" products and companies.

"ARPA purchased" facilities at universities are an example of effective use of the "demand side" approach. Although timesharing was invented at MIT, widespread use at universities was the key to understanding and its evolution. ARPA funded universities to purchase machines from DEC, IBM, SDS/Xerox, and Univac. Similarly, Symbolics and SUN workstations fostered AI and distributed computing, respectively. However, vendors such as Xerox objected to the process that allowed easy purchase of "DARPA favored" machines.

The approach that Sid Fernbach, former head of computation, used to help supercomputing come into existence was: specify the needs, purchase, and be a knowledgeable, demanding, tolerant, and helpful customer. This was used successfully at both Livermore and Los Alamos for supercomputers, high speed networks, large file systems, and large-scale, high performance graphics terminals.

The "Los Alamos approach" of funding its engineers to develop special devices is unlikely to create any residue accept a home-grown product to support, and is lilely to be wasteful because technology transfer is so difficult. Start-ups and products that a company acquires and builds based on a lab's technology and designs are unlikely. Few, if any, start-ups or significant technology or products have come from Los Alamos.

The "supply-side" or "build it and they will come and program it" approach characterized by DARPA's Strategic Computing Initiative (SCI) or High Performance Computer and Communications Initiative (HPCC) is much flawed and likely to fail to develop technology, products, or lasting companies. Beginning with DARPA's funding of the Illiac IV, and continuing into the SCI described in Appendix A.1, the State Computer efforts were de-coupled from any DARPA need and customers. DARPA and the computer science community academic contractors had no compelling problem that State Computers were designed to solve. The evolution of Thinking Machines' CM1 (sans floating point) as a single user computer, to finally have 64-bit floating point illustrates the groping for a use and lack of focus. A traditional startup would more likely have been right at the start.

4. "Direct funding of large scale projects" is both risky as to outcome and to long-term, training and the establishment of an industry. The early '70s example of ARPA funding BBN to develop packet switching for ARPAnet illustrates why direct product development and product purchase doesn't work very well. Perhaps the only way to get ARPAnet was funding its development by a single contractor as ARPA did. I believe that an architecture and standard that allowed many suppliers to build equipment could achieved even more impressive results. Ironically, this was exactly what happened when universities got involved in the network research.

BBN became the sole source to ARPA and the DOD for switching. It sold some switches commercially. In order to increase its prices and margins, BBN built proprietary hardware that rapidly became obsolete. BBN was a high priced producer, and the market barely moved because it had a sole customer -- the military. Concurrently, DEC developed its own packet switching using ARPAnet ideas. With no ARPAnet implementers, the technology had to be re-developed before products could be designed. In the 1990s computer networking is the "hot" area of the computing industry, and the HPCC will further expand the market. Hundreds of startups have formed, and companies such as Cisco (coming from Stanford), Synoptics, 3COM, and Wellfleet have revenues of a billion dollars. Many communications companies are staffed with BBN and DEC alumni. BBN is a minor supplier.

5. Direct funding of product development to companies, as practiced in (D)ARPA's Strategic Computing Initiative and HPCC, is not a model for effectively funding computer systems development. Funding product development, targeted purchases, and other subsidies to establish "State Companies" that selectively subsidize product development in a vibrant and overcrowded market (see Tables 1-2) is wasteful, likely to be wrong, and likely to impede the development of technology by having to feed an overpopulated industry. In the early 1980s, DARPA funded roughly 20 parallel computer efforts that kicked off the Strategic Computing Initiative described in Appendix A.1. None were successful. State funding of a company doesn't build a strong company, because the company has a safety net, doesn't have to support itself, and is non-enduring. Furthermore, such a company can use profitable government business to subsidize the commercial part, thereby creating an unfair and artificial market. Since State Companies have a monopoly of the leading-edge, early adopter university and laboratory market, privately funded companies are denied the key to market entry.

In the case of the HPCC, the traditional, small, fragile, high development cost supercomputer market is disturbed and may be destroyed as hype about massive parallelism, backed by government supported computer purchases, replaces reason when buying computers. There is little evidence to support massive parallelism as a superior alternative to traditional supercomputers or clusters of workstations, etc. that would have occurred in an evolutionary fashion. In fact, the more we learn about the actual performance of State Computers, the poorer and more highly specialized they look. Their principal value is likely to be the training of students. Another benefit is software systems and apps that can be used on workstation clusters, which should have been supported.

The technical computing market has always attracted entrepreneurial engineers. In the '90s the likelihood of starting a successful parallel computing company is small because of a small, over-crowded market. Furthermore, when the government funds even more companies, it risks getting no return (i.e., taxes). Competent, established companies such as Convex and IBM are working on massive parallelism using workstations to further reduce prices to the commodity level. Also, many computer companies are using ATM. This will permit coarse grain multicomputers at essentially zero cost, with the compatibility with workstations and servers that is the key to applications.

Parallel processing is more difficult than imagined. The number of applications that can be solved uniquely by coarse grain multicomputers is small, based on the slow growth in trained personnel and the diversity of architectures and programming approaches. Taking a dusty deck from a supercomputer requires rewriting and, in some cases, a new algorithm. Since each of the multicomputers has its own idiosyncrasies, like its supercomputer ancestors, program portability is limited except for embarrassingly parallel applications. Thus, the lack of critical applications software may be the fatal flaw that limits massive parallelism.

Instead of a billion dollar market, the large number of companies with unique, massively parallel computers aimed at a Teraflop computer will probably see a market size of only a few hundred million. In 1992, this market was about $250 million. The Teraflop supercomputer costing $30 million will most likely arrive in 1996-1997, based on the evolution of CMOS microprocessors governed by Moore's Law, which shows 60% density and speed improvement per year.
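
For readers who want to check this projection, here is a minimal sketch (Python) that compounds the 60%-per-year improvement; the 1994 baseline of roughly 150 Mflops per fast CMOS microprocessor and the ~$15,000 cost per node are illustrative assumptions of mine, not figures from this paper.

    # Sketch of the Moore's Law arithmetic behind the 1996-1997 teraflop estimate.
    # Assumed (not from the paper): ~150 Mflops per fast micro in 1994, ~$15K/node.

    BASE_YEAR = 1994
    BASE_MFLOPS = 150.0        # assumed peak Mflops of a fast CMOS micro in 1994
    GROWTH = 1.60              # 60% improvement per year (the paper's figure)
    COST_PER_NODE = 15_000     # assumed dollars per microprocessor node
    TARGET_MFLOPS = 1_000_000  # 1 Tflops

    for year in range(BASE_YEAR, BASE_YEAR + 5):
        mflops = BASE_MFLOPS * GROWTH ** (year - BASE_YEAR)
        nodes = TARGET_MFLOPS / mflops
        cost_millions = nodes * COST_PER_NODE / 1e6
        print(f"{year}: {mflops:5.0f} Mflops/chip, {nodes:5.0f} nodes, ~${cost_millions:.0f}M")

Under these assumed numbers the $30 million crossover falls between 1996 and 1997, consistent with the projection above.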

A significantly smaller universe of computing environments is needed. In high performance computing, Cray & IBM are givens; SGI is a key supplier; HP/Convex are likely to be contenders, and others, e.g., DEC, are trying. No State Company (Intel, TMC, Tera) is likely to be profitable and hence self-sustaining, even though they have received a billion dollars in R&D contracts, co-development projects, and targeted purchases over a decade.

6. "University-Company collaboration is a new area of government R&D. It is worthwhile to encourage university and company collaboration to build and use systems. Any company should be free to work with a government funded university project or laboratory to produce technology or product prototypes. In fact companies should be especially encouraged to collaborate through tax incentives. Researchers should consider industrial partners based on ability to market the product. For example, Burroughs, GE, Honeywell, and Motorola have been especially inept partners for computing because they lack market presence, and none of these efforts have produced useful products, nor should they have been funded. Intel's iWARP is a borderline government funded university-company product since it had no market channel output.

A company should not be government-funded as part of university-company collaborative product development. If a company spends its own funds, it is more likely to take the technology and product seriously as part of its product portfolio, and not as a way to recover its research costs or to slough off or pay for marginal people or work.

7. CRADAs, or cooperative research and development agreements, are very closely allied to direct product development. CRADAs have the advantage of being driven by customers or laboratories that can either supply technology or buy a product. In many cases, a CRADA is a thinly disguised purchase order in which an agency gives money to a laboratory to buy a computer and fund laboratory and company product development, e.g., porting of an application, and the company provides window dressing funds that would have previously been considered a discount.

8. Funding of apps development is perhaps the most significant way to subsidize a given computer system's development, because equipment purchase and use implies an endorsement. Industrial and other partners are implicitly committed to purchasing the same compatible system.

The only acceptable way to fund apps development is through the porting and benchmarking of a set of computers. This facilitates the training, understanding, and use that would otherwise be missing.

9. Encourage the use of computers and discourage designs from those who have not used or built a successful computer. The world is drowning in massively parallel computers that absorb programmer time chasing PAP -- peak announced performance. Furthermore, the number of computers (and options) will increase over the next decade as Convex, IBM, Japanese, and other companies enter the market (Table 2). Given the results (Table 1), it's unclear that many universities are qualified and needed to develop computer systems. All of the universities capable of working on systems and applications software are essential to the massive software effort implicit in massive parallelism.

Summary

A number of heuristics are given regarding effective funding of computer systems research and development. Given the nature of computer systems, projects are relatively large scale, costing several to tens of millions of dollars depending on whether the project builds on a design and existing computer system and infrastructure or is built from scratch in a university or a company.

Only two funding methods have been found to be valid: university research that is transferred via a startup company and the purchase of systems by knowledgeable, early adopter users to validate them and assist in their development and evolution.

Given the extensive ARPA funding to companies, it is critical to understand whether the nil results presented herein are a future predictor.

Table 1. (D)ARPA Funded Computer Systems Universities & Companies

group technology intermediate transfer co. or product transfer

Berkeley RISC SPARC chip*

RAID SF2 & many others

databases & database computers Relational, Informix, etc

timeshared computer (c1965) SDS 940

Cal Tech fine-grain multicomputers Ametek, Intel, NCUBE

worm-hole routers -> DASH, etc. Seitz/Cohen startup

fine-grain multicomputers HP (in progress)

CMU C.mmp

Cm*

systolic arrays: GE, Honeywell Intel iWARP

Nectar Fore (ATM switch co.)

Columbia DADO

non-Von

Illinois ILLIAC IV/ Burroughs Kuck Assoc.

CEDAR Alliant Campus

MIT LISP Machines LMI, Symbolics

Connection Machine Thinking Machines

Monsoon (Motorola), T* Japan dataflow (Res.)

Alewife, J-machine

CTSS and MULTICS DEC PDP 6/10 GE 645

Rice parallelizing compilers Fortran D, HPF?

Stanford graphics SGI

workstations SUN

SIMD arch MasPar

RISC MIPS

DASH /SGI

Syracuse parallel apps ?

Utah graphics companies & alumni E&S

Yale super scalar Multiflow

State Computer Company Funding and Efforts

BBN MPP proto multicomputer

Cray Res. MPP proto, apps, customers DOE Lab agreement

Encore large scale multiprocessor mP MACH

Intel iPSCs, software, apps, customers iPSCs

TMC CM1,2,200, 5, software, apps, customers CM5

Tera prototype for a multi-threaded computer

*strike-through=unsuccessful, startups=plain text, established companies=bold

Table 2. Computer Systems Development by Government Purchases

Company product/technology competition market

Government-assisted product purchases that were used to significantly start a product

CDC / ETA early supers IBM DOE Labs

Cray Res. supers IBM DOE Labs

DEC timesharing CDC IBM Univac SDS DARPA universities

IBM many from early computing labs, unis, commercial

SUN workstations DEC HP IBM etc. universities

Non-State Computer companies that compete with State Computers for funding

Convex/HP mini-supers labs, unis

IBM workstation clusters labs, unis

KSR scalable multiprocessor labs, unis, commercial

MasPar SIMD labs, unis

NCUBE labs, unis

International Companies in supercomputer market requiring Government support

Fujitsu supers, MPP

Hitachi supers, MPP to come

MEIKO

NEC SX, MPP to come

Parsytec Transputers

Companies that failed to compete during the '80s in the super-minicomputer, mini-supercomputer, supercomputer, and massively parallel technical markets (government support shown in bold): Alliant, American Supercomputer, Ametek, AMT, Astronautics, BBN Supercomputer, Biin, CDC (independent of ETA), Cogent, Culler, Cydrome, Dennelcor, Elexsi, ETA, E&S Supercomputers, Flexible, Floating Point Systems, Gould/SEL, IPM, Key, Multiflow, Myrias, Pixar, Prisma, SAXPY, SCS, Supertek (part of Cray), Suprenum (German National effort), Stardent (Ardent+Stellar), Synapse, Vitec, Vitesse, Wavetracer.

Appendix A. 30 Government Funded Computer and Parallel Processing Efforts

1. Berkeley's UNIX and RISC. Work on UNIX resulted in UNIX standards and ultimately in SUN's operating system. The RISC microprocessor prototype resulted in SUN's Sparc architecture. No government involvement occurred outside of funding university research. RAID also came out of funding Berkeley. The most useful aspect, outside of the early papers, was the reduction to practice by various companies, including SF2, to which Katz transferred the technology. Berkeley, with RAID, Sparc, and Unix, should be the archetype for computer systems research and development!

In the mid 1960s, Berkeley built an early timesharing computer based on the SDS 930 that SDS sold successfully. SDS went on to build its own unsuccessful timesharing systems. Berkeley built other computers, including Baskin's unsuccessful multiprocessor and various Prolog computers.

2. BBN Supercomputers. The company was initially funded by DARPA to develop a multicomputer for packet switching, and then to develop a successor, Monarch. The initial computer worked for communications, but the company aspired to reach the general market, although it did not understand the market need. When DARPA withdrew support, the company folded. Ironically, the Intel and TMC computers are really just variants of BBN's first multicomputers.

3. Cal Tech Multicomputers. In the best university tradition, NSF and DOE funded Cal Tech to build the first hypercube multicomputer with some help from Intel. Geoffrey Fox mobilized Cal Tech users with real problems to exploit the multicomputer. DARPA became involved in the early 1980s when it became clear the multicomputer provided the easiest path to high peak performance. Intel and the now-defunct Ametek started building computers based on the multicomputer idea and license. NCUBE also started up to build multicomputers.

Intel's first Hypercube was an adaptation of Seitz' machine using low performance processors, higher communication overhead, and longer latency, resulting in the creation of poor, medium grain multicomputers. Intel established an independent business, formed from the unsuccessful 432 and Biin system groups, to get into the supercomputer business. In its second generation, Intel used its i860. However, the i860 is not Intel's mainline development, and the advantage of a commodity chip that could attract software is lost. It's unclear whether Intel's decision to enter the supercomputer market was based on government support for development and purchase of new computers. If this had been a strictly profit making venture, Intel might have used another architecture and not used the government to support a marginal effort. If the market evolves to become mainline, it's unclear how significant Intel will be, unless HPCC continues to provide significant development contracts and purchase orders.

From 1991 to 1994, Intel's Delta, a prototype for the Paragon shipped in 1993, has operated at Cal Tech to serve fewer than 200 users. System availability is ≈90%, and the machine operates in debug mode (a dozen users share the machine) and production mode (1-4 users). Programs run at 1-5 Gigaflops depending on whether they can operate within the limited I/O and file system. All together, the entire load could easily be satisfied by 2-4 processors of a 16 processor Cray C90, and users would have been spared the inconvenience of rewriting programs for a prototype.

NCUBE was started by Steve Colley, a Cal Tech graduate, without funding or the Cal Tech license. NCUBE developed its architecture, designed its own microprocessor, and developed its software independently. With increased competition, NCUBE was sold to Larry Ellison, Oracle's founder, for commercial applications. NCUBE has worked to get a fair and open market. In 1994, NCUBE is designing a computer for video-on-demand and parallel databases.

Seitz continued to research the fine grain multicomputer design by developing processors that have high communication bandwidths and the ability to communicate short messages among the processors without extensive overhead and long latency. These creative processors offer about 100 times the density of the commercial counterparts his work spawned at Intel. Seitz (1993) has stated:

"I believe that the commercial, medium-grain multicomputers aimed at ultra-supercomputer performance have adopted a relatively unprofitable scaling track and are doomed to extinction. With their relatively modest number of nodes, they may, as Gordon Bell believes, be displaced over the next several years by shared memory multiprocessors. For the loosely coupled computations on which they excel, ultra-super multicomputers will, in any case, be more economically implemented as networks of high-performance workstations connected by high-bandwidth, local-area networks such as ATOMIC."

In 1994, Seitz and Cohen left academia to found a startup that would use fine grain multicomputer technology for a high speed network.

4. CDC. Since its founding, CDC excelled at selling computers to our national laboratories, including getting computer purchases written into law. This started when large computers were very expensive, but continued as a competitive weapon against Cray Research after Seymour Cray left CDC in 1972. In 1992 CDC has no competitive products, but resells other companies' products.

In the early 1980s, CDC spawned ETA to build a competitive supercomputer based on their STAR and 205 memory-to-memory vector architecture. The ETA 10 offered a peak of 11 Gigaflops in 1988 and would most likely have won the Bell Prize in 1989 and 1990 had ETA lived! Because it had poor scalar performance, just like the "State Computers", it was unable to compete in the general purpose market, which requires running all types of jobs from scalar to embarrassingly parallel. Also, ETA didn't understand the importance of software or how to build it, and didn't move to port UNIX until it was too late.

5. CMU C.mmp and Cm*. In the 1970s DARPA funded the research to build two multiprocessors, while DEC provided much of the hardware as a partner. Several C.mmp alumni went to Intel and made the same architecture mistakes with the 432, a project on which Intel spent over a billion dollars. Cm* was the first scalable multiprocessor, and the ideas influenced both the Encore research and the KSR computer. Both projects generated alumni (e.g., Andy Bechtolsheim at Sun, Fred Pollack at Intel) who went on to build important computers. The Cm* project had a high payoff in terms of tools, knowledge, and training per $ spent, but there was no direct product or company output.

6. CMU Systolic Processing (Warp) with GE/Honeywell and Intel (iWarp). The first Warps were brought to market by GE and Honeywell, but since neither company had market presence, the most likely benefit was government support for an industrial laboratory. In the case of iWarp, Intel took too long to get a weak product to market. If iWarp had been carried out as a commercial product, most likely it would have performed far better and leveraged other work. It's fairly clear that Warp is not a useful or successful technology, product, or project. However, it may have helped Intel learn how to design better switches for its multicomputers. The Warp is fundamentally the wrong architecture and completely unable to compete with the aggressive builders of commodity Digital Signal Processors (DSPs) in a flourishing market!

7. CMU Nectar and Fore Systems' ATM Switch. In the late '80s Kung shifted his interest to switch design in order to connect various systolic and other processors. Nectar was the basis of a HIPPI switch that was done in collaboration with Network Systems Corp., the company that developed the Hyperchannel, a 50 Mb/s network bus for supercomputers.

The team that "trained" by building Nectar went on to found Fore Systems and deliver the first ATM switches. In 1994 this Pittsburgh startup is profitable, growing rapidly, and provides leading edge switches for a range of early adopters.

8. Columbia's DADO and Non-Von. Nothing. These efforts taught some graduate students and faculty that parallelism is hard and that building systems is non-trivial.

9. Cray Research. Cray Research was founded in 1972 by Seymour Cray to build supercomputers. The Cray 1 established the first guidepost for supercomputers by establishing the vector architecture. The Cray XMP, introduced in 1982, added multiprocessing to increase both peak performance and throughput. This line of vector multiprocessors with dense ECL packaging, the fastest clocks, and pipelined processing is most likely the mainstay and blueprint for general purpose supercomputing until 2000. IBM and all the Japanese supercomputer manufacturers (Fujitsu, Hitachi, and NEC) use the same basic design. However, NEC has excelled at providing more peak power by using multiple pipelines.

Cray has jumped on the "use commodity CMOS microprocessors (DEC's Alpha) and interconnect lots of them" bandwagon. Cray's ability to interconnect many computing elements with low latency makes it a serious "State Computer" supplier. Cray's major competitor (Fujitsu) is leveraging vector technology to provide a multicomputer with an order of magnitude more power per node than the Cray MPP; thus, Cray may have deluded itself in taking the wrong path with "free money". Has DARPA funded Cray to develop the wrong computers?

10. DEC PDP-6/10 and timesharing. In 1963, I led the development of the PDP-6 timesharing system based on MIT's CTSS. About 20 were sold, and in the 1970s its successor, the PDP-10, became the computer that most ARPA sites bought for their computing. Researchers, with ARPA $s, also bought the IBM 360/67, GE 645, and SDS/XDS systems. No one told a university which machine to buy. The PDP-10 (KA10) was profitable, and DEC invested in subsequent successors (KI10 and KL10). The PDP-10 line was discontinued when the engineering team failed to produce a competitive successor (akin to the situation at CDC). DEC got software, trained people, and direction from the ARPA users. BBN's ARPA-funded Tenex prototype and its people came to DEC to implement the DECSystem 20. Just like CDC, Cray, Sun, and even Symbolics (even though they ultimately failed), the relationship of DEC as a supplier to government funded university and lab users appears to be the best way to develop products, a market, and a company/business.

11. Encore Ultra/Gigamax. After founding Encore, I proposed a DARPA project to build a multiprocessor with several hundred processors in a two-level hierarchy, based on our 20 processor multiprocessor. The rationale was that startups were efficient and competent. Henry Burkhardt and I were forced to leave the company, and no prototype was delivered, although the group continued and built Multiprocessor MACH. Encore was able to get significant sales through its DARPA connection. In 1992, the Encore software group formed the Center for High Performance Computing as part of Worcester Polytechnic Institute. The Encore project illustrates the importance of continuous project review and that funding must be to a PI and not a place.

12. Floating Point Systems. In 1987, FPS got Senator Mark Hatfield to pass a law forcing DARPA to purchase their transputer-based multicomputer. The machine was delivered, but may have never worked. In the downsizing, FPS got out of this business, leaving dead-end hardware for graduate students at Cornell and Syracuse to dissect or resurrect. In 1994 FPS is part of Cray, building SPARC multis.

13. Illinois CEDAR Center for Supercomputing Research and Development. DOE and NSF funded Dave Kuck's laboratory to build CEDAR, a hierarchical multiprocessor based on hardware supplied by the now-defunct Alliant. This prolific group has produced more compilers, experiments, benchmarks (The Perfect Club), and understanding about supercomputing than any other laboratory. Kuck Associates formed to deliver compilers using lab-developed technology. Illinois is one of the few sources of fundamental parallel compiler technology.

Craig Mundie, a founder and CEO when Alliant went into Chapter 11 (summer 1992), believes that Alliant might have been successful had it been able to get additional capital, but that DARPA, Intel, and Thinking Machines created MPP hysteria and succeeded in "Osborning" the supercomputer industry. Ironically, Intel and TMC "Osborned" themselves by a full year and would probably have collapsed without government support buying non-functional machines! MPP hysteria also caused Convex and Cray "to Osborne" and pre-announce their machines by about a year.

14. Illinois Illiac IV. In the late 1960s Dan Slotnick proposed the Illiac IV, and its saga is well documented. DARPA funded Burroughs to build it against the advice of the DARPA peer community, including my own concerns about the design team and circuit technology. Burroughs was never in the technical marketplace, nor did it understand it, and it used the contract to fund its Paoli, Pa. organization. The machine never worked well, although it was a precursor to TMC's SIMDs. The main output was Kuck's parallel compiler. None of DARPA's user-researchers wanted or needed Illiac IV for their own work, a situation quite similar to HPCC!

15. Lawrence Berkeley, Livermore, and Los Alamos Laboratories. These labs are funded by the Department of Energy. Over many computer generations beginning with LARC and Stretch, the Livermore and Los Alamos laboratories have played a key role in the development of supercomputing by being the leading edge buyers of first computers. Each lab operated differently in creating new non-computer products such as fast networks, mass storage systems, and multiple high speed terminals.

LBL's foray into computing has been minimal. It attempted to build one unsuccessful multiprocessor based on Modcomp processors.

Livermore: Be the Most Demanding Customer. The late Sid Fernbach, who headed computation at Livermore, was a demanding but patient and helpful user of the first supercomputers. Livermore specifies a need, and then gets companies to bid a product solution -- usually the first one off a line. It works with the company to make the product work in the Livermore environment, and when done, the company has a tested, marketable product. This customer-driven mode is still in operation in the 1990s. Ironically, Sid was pressured to resign as head because he had chosen CDC's ill-fated STAR.

Lowell Wood's group at Livermore worked on the development of several multiprocessor computers; two models of its S-1 machine were built, but the only real output was a method of interconnecting chips on a substrate, resulting in the startup company NCHIP.

Los Alamos: Build something and someone may come and exploit it. Los Alamos engineers have a history, beginning in the late 1940s with the Maniac, of designing and building one-of-a-kind prototypes to solve a problem. Next, they attempt to get a company to adopt, build, and market the product... an un-American act for a not-invented-here culture. Behaving in this fashion, Los Alamos needs a large staff, but the infusion of technology into the industrial sector is nil. Los Alamos has created only a few start-up companies that use or exploit laboratory technology! Its greatest accomplishment, the HIPPI standard, is a minor adaptation of the Cray connection. HIPPI took a decade to evolve because the bottom-up design approach failed to recognize the needs of such a standard, including description, testing, protocols, and evolution for increased distance and new technology.

Los Alamos has also been successful at influencing computer development with purchases of Stretch, the early Cray computers, and Thinking Machines.

16. Michigan MTS and the IBM 360/67. Early timesharing at Berkeley, Michigan, and MIT was supported by ARPA. Michigan built one of the earliest systems.

17. MIT timesharing. MIT influenced timeshared computers by building one of the first, based on the IBM 7090, which influenced DEC's PDP-10 line. Its subsequent system, Multics, was directly transferred to GE/Honeywell and influenced IBM's timesharing efforts.

18. MIT's Decades of Dataflow culminated with Motorola's 1991 delivery of Monsoon. From a performance standpoint, the computer is underwhelming, and hence it was abandoned. In 1994 the Dataflow group is designing T* (or re-T*), a multithreaded processor to hide the latency that is inherent in a distributed multicomputer ... a variant on the Teracomputer. Since Motorola is not in the scientific market, it's hard to believe computers built in this fashion will be successful or even see a market. It's unclear whether Motorola was paid to produce the prototypes. However, taxpayers need to understand why a commercial company should receive government funds for co-developing either research prototypes or products with a university.

Several Japanese research laboratories including ETL and NEC have built high performance Dataflow machines. There are no commercial dataflow computers.

19. MIT Connection Machine and Thinking Machines. Founded in 1983, Thinking Machines followed the traditional "do the proto at a university and then start a company" model, except that DARPA was a major funding partner for the company and stimulated subsequent machine purchases. Also, various contracts have been let to develop software such as compilers for the Connection Machine. Agencies bought copies with DARPA's help, and universities bought computers with contract money. A laboratory that might have bought another Cray got a Connection Machine for large applications instead, reducing the need for another supercomputer.

Every one of us who has built a supercomputer (i.e., Alliant, Ardent, CDC, Cray, Convex, Fujitsu, Hitachi, IBM, Meiko, NEC, Stellar, Tera) knows that, like waffles, the first one is likely to be thrown away, since each of our first designs had flaws. Thinking Machines took three tries (CM1, CM2 with 32 bits, and CM-200 with 64 bits) before TMC got its first Connection Machine right even for limited use, and the number delivered in their largest year was about five full-scale 2000 PE machines, or 10,000 Weitek chips worth of floating point (≈100 Gflops) -- roughly a month's worth of floating-point chips for workstations. Except in a few cases, a 512 PE (1/4 size) CM200 performs like and does the work of one processor of a Cray YMP. Though the CM2 was fundamentally flawed as an economically viable general purpose supercomputer for a variety of reasons, it was a fine niche product to be part of a supercomputing center to run very large scale, highly and embarrassingly parallel jobs. In effect, it is a one-experiment central facility, such as a wind tunnel, weather modeller, etc.

The CM5, prematurely introduced in 1991, is a completely different approach to the teraflop from SIMD, not based on its predecessors. TMC has to learn more about general purpose computing, even though the SIMD nature of the machine (a few Sun servers that connect to and control a collection of 32 to 1K processors, each controlling 4 proprietary floating point units) vaguely resembles a CM2.

TMC's CEO, Sheryl Handler, and Chief Scientist, Danny Hillis, have been effective salespersons in Washington to every agency and senator, including involvement with Vice President Gore. However, responsible companies don't pre-announce computers a year before delivery -- the result is usually financially devastating (the Osborne Syndrome). Even with government sponsored purchases to fuel the "top line" in various ways (purchases, R&D contracts, and contracts from other DARPA contractors), the company has not been profitable since its start. It is unlikely to pay back its government and private investors. Overall, the problem isn't necessarily having a limited market, special purpose computer, or the people and their aspirations. Too much funding and a funding safety net induce brain damage and allow poor management!

TMC will be successful as HPCC goes forward provided government keeps buying machines at a faster rate than the company hires. TMC gets substantial revenue when laboratories such as Los Alamos encourage industrial partners, e.g., Mobil, to buy machines in order to collaborate and share software. A government bail-out purchase of a $250M CM5 Ultracomputer would be a tremendous waste in light of the many alternative investments and the fact that such a machine is hardly likely to deliver many operations.

In 1994, Ms. Handler was replaced as CEO by an executive from a Washington, D.C. law firm. The company is looking for strategic partners to help fund its continuation and leverage product development and/or distribution.

20. MIT LISP Machine to spinoffs LISP Machines and Symbolics. These computers follow the traditional model: fund a project in a university and start a company to build it. At most, the world needed one company. Both companies were "done in" because plain old computers evolved faster than their special purpose designs. In the early '80s Tom Knight, one of its developers and a Symbolics founder, promised DARPA a factor of 50 performance improvement in their next generation by the mid-'80s. It's unclear whether Symbolics was funded to develop such hardware, but in no case was it able to deliver a significantly faster LISP computer.

LISP Machines went out of business. TI was a sub-contractor and also made and sold the machine, but a special computer did not compete with the fast-moving workstations that used general purpose microprocessors.

21. 1994: MIT Trilogy of Multiprocessors. In the early '90s the Laboratory for Computer Science had three projects. The MIT machines (J-machine, Alewife, *T) are an example of designing computers with little understanding of use and an inability to focus resources to build a single successful machine. The effort and cost to build an operating computer and get meaningful results can be quite large, and must fit with the vision, talent, and resources of the research team. The J-machine appears to be the only interesting or useful output other than training. This is due in part to Dally's Cal Tech training in building experimental systems.

22. University of North Carolina Pixel Planes. Fuchs has built a number of SIMD systems for graphics. Although I do not know how this work has affected graphics architectures generally, Fuchs and the architecture were the basis of Stellar's (c1986-1990) graphics supercomputer and the notion of a virtual pixel buffer.

23. Purdue parallel computer. This NSF funded effort produced a working computer, but results outside of training were nil.

24. Rice University Parallel Compilers[2]. This effort was funded at a low level by grants from NSF, IBM, etc. until DARPA was forced to get a compiler for its own machines. Many of the key ideas and people writing compilers have come from Rice. Rice may have the highest output and cost-performance of any university compiler research. The new FORTRAN dialect, HPF, may not be needed at all, but is an artifact of MPP architectures and the directives introduced by Rice in the Fortran D work. Thus, a by-product of the search for massive parallelism is massive work, a new language, and the likelihood of having to train a plethora of low-level "machine-language" programmers who use Fortran syntax to write pipeline microcode in order to get a single program to run on a particular computer and even a particular configuration. All of this will certainly detract from training, economical use, the porting of applications, and the focus on compilers that deal with idiosyncratic computers.

25. Stanford: Tom Blank's SIMD architecture, the basis of the start-up MasPar. MasPar's computers are configured in the $100K-$1M range as servers to other computers, and are comparable in performance to a CM2. The MasPar has an excellent reputation as a workhorse that can accept supercomputer codes because it has a fast interconnection network allowing it to "look" very much like a traditional supercomputer. The aberration correction calculations for the Hubble telescope that the HPCC brochure cites as a success of massive parallelism were done on a MasPar computer.

MasPar should be a success story of government funded university research that ended in a successful product and company... however, State Computers and a non-level playing field increase sales costs and inhibit its natural use.

26. Stanford: MIPS, SGI, SUN Microsystems. These three DARPA-supported efforts by Hennessy, Clark, and Baskett serve as the best model for university development, followed by start-up company exploitation. Ironically, this came from a single contract administered by Baskett.

27. Stanford: DASH and DASH follow-on. DARPA funded the research, and SGI provided much of the hardware. This is an ideal university project because DASH uses existing SGI hardware. The project can go on to the hard parallelizing compiler problem. The scaling is fine since the design goes to 64 processors and the nodes are slow relative to communication. SGI might exploit the technology if it can be shown to be applicable as the speed of individual nodes increases. However, a 16 processor version still only achieves the performance of Ardent's 4 vector processor Titan, introduced in 1988. Thus, if you want lots of megaflops, start with lots of megaflops in each node.
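
The point about per-node megaflops is simple aggregate arithmetic, sketched below in Python; the per-node figures (≈16 Mflops for a Titan-class vector processor, a few Mflops for a scalar microprocessor node of that era) are hypothetical, chosen only to illustrate the comparison, not measured data.

    # "Start with lots of megaflops in each node": aggregate peak is just
    # nodes x per-node peak. The per-node figures below are hypothetical.

    def aggregate_mflops(nodes: int, mflops_per_node: float) -> float:
        """Peak aggregate Mflops of a machine built from identical nodes."""
        return nodes * mflops_per_node

    vector_machine = aggregate_mflops(nodes=4, mflops_per_node=16.0)   # Titan-like
    scalar_machine = aggregate_mflops(nodes=16, mflops_per_node=4.0)   # DASH-like

    print(f"4 vector nodes:  {vector_machine:.0f} Mflops peak")
    print(f"16 scalar nodes: {scalar_machine:.0f} Mflops peak")

With these figures, sixteen slow nodes only match four fast ones, which is the force of the remark above.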

Stanford (John Hennessy) has a design target for a 1K node multiprocessor for which funding is being sought in order to have a product. Based on what was most likely learned from DASH, a research-product computer of this type would at most equal KSR's present computer, and would most likely need some of their patents. Hennessy has suggested that various companies might build the computer. The fact that a special microprocessor architecture is required is another indication that plain old, killer, CMOS microprocessors don't work for scalable computers. It would be imprudent for DARPA to fund a company to build such a university-designed computer ... if the machine is good, a company should be encouraged to fund and build it with Stanford. Ironically, Intel, with ARPA funding, may be a key silicon provider and partner in the DASH follow-on.

28. Stanford & Lawrence Livermore S-1 & Logic Timing Verifier. Two PhD students, Tom McWilliams and Curt Widdoes, worked on the design of a large scale multiprocessor project. The principal technical output of the design was the first timing verifier, SCALD. In the early 1980s, this became a key component of a design system that launched Valid Logic, one of the first three startup CAE companies. The S-1 became a project at Livermore, and a prototype was built that validated the design system under Lowell Wood. A more ambitious full-scale multiprocessor was attempted, and one processor was built.

A related packaging project for the S-1 for multichip modules was the basis of the startup company NCHIP.

29. Syracuse Center for Parallel Processing. This center was legislated for DARPA to fund. The center is credible based on hiring Geoffrey Fox to form a strong group. Sometimes the Bureaucrat's (or Politician's) Golden Rule works: "Regardless of national policies, laws, and a compact of peers, researchers choose to go their own way, and they choose to go where there is money to support them."

30. Tera Computer is DARPA and SRC (NSA) funded. This State Computer startup came from an SRC design for HEP II, which in turn was based on the HEP computer built by a failed supercomputer startup. Thus, like DARPA, SRC's ego is at stake in what is really a moral conflict of interest. Tera could obtain no funding for such a computer through regular financial channels, but is being funded to design a computer. The design is reasonable, but why should the government invest in a team that has never built a successful computer? Given the long stream latency, scalar performance will be poor (e.g., 300 MHz/15, or 20 MHz per stream). Thus, while Tera may be a fine massively parallel computer, it is not a general purpose computer by any definition, including that of its architect, Burton Smith.

31. Utah and Graphics. The Evans & Sutherland connection to start graphics at Utah, a company to exploit the research, and the training of almost the entire field of engineers and scientists who built the graphics and imaging industry are undoubtedly DARPA's greatest success. Note that DARPA funded no product developments at companies. Much of the work was done on a PDP-10, and they built minimal hardware. What is it about this experience, other than the time/need/problem and the obvious people, people, people, that made such a success?

32. Yale, ELI (Extra Long Instruction computer), and the Multiflow Company. Professor Fisher was funded by government grants to evolve the FPS compiler and architecture in order to execute a large number of operations in a single wide instruction (7-28 operations per instruction). Fisher's architecture is a vector processor that is microprogrammed by the compiler. The machine resulting from the research, built at the now defunct Multiflow, had very good characteristics because it operated effectively on traditional vector supercomputer codes and, in addition, could achieve higher performance on scalar code by executing a few operations in parallel.

The most likely follow-on will be an HP product, since both Fisher and Rau, Cydrome's founder, are researchers there.

Appendix A.1 1985 DARPA Strategic Computing Initiative (SCI) Program

We can look at the effectiveness of DARPA's ability to select and manage computer development by reviewing the SCI, June 1985-1994. The following companies/organizations attended a contractors' meeting to discuss their proposed approaches to high performance computer designs: CMU Warp (with GE and Honeywell), ESL, IBM (RP3), Encore, BBN Labs, Princeton (MMMP), GE (another Connection Machine), Thinking Machines (Connection Machine), Hughes (Dataflow machine), SRI (Eazyflow), Schlumberger (FAIM-1), ATT/Columbia (Non Von), Bell Labs/Columbia (DADO), SDC/Burroughs, MIT Lincoln Labs, CMU (Production Systems), Georgia Tech, Harris/MIT, Hughes, University of Texas, MIT (Dataflow). Other efforts were subsequently funded, some efforts, e.g., Motorola and Thinking Machines, were continued, and various efforts were dropped or transformed into other projects or laboratories.

The results of this score of machine R&D efforts: about 1/2 were funded; 3 or 4 succeeded in building a computer and measuring it; and in 1994, exactly no computers or the ideas surrounding these efforts survived! Training appears to be the only output of this particular funding snapshot, after ten years.

Appendix B. Non Government funded and foreign efforts

1. Convex used fast HP CMOS micros to build a scalable multiprocessor. It has an agreement to collaborate with HP in technical computing and is likely to succeed because it is fully compatible with HP workstations and multis.

2. Fujitsu VPP500 - 200 nodes with 1.5 Gflops per node, priced competitively at 4000 flops/$. This computer is a threat to other efforts because a node is source code compatible with Fujitsu's successful VP-series supercomputers that have evolved over the last decade. Fujitsu has an impressive array of apps that run on the nodes. The most direct effect will be to unplug the Crays that Japanese car manufacturers use for crash simulation and finite element modelling. Depending on how the computer is marketed, both Convex and Cray are threatened.
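
As a back-of-the-envelope check in Python, using only the figures quoted in this item (200 nodes, 1.5 Gflops per node, 4000 flops/$): the aggregate peak and the implied full-configuration price below are derived from those numbers, not quoted prices.

    # Check of the VPP500 figures quoted above. Inputs come from the text;
    # the full-configuration price is derived, not quoted.

    NODES = 200
    GFLOPS_PER_NODE = 1.5
    FLOPS_PER_DOLLAR = 4000

    peak_flops = NODES * GFLOPS_PER_NODE * 1e9     # 300 Gflops aggregate peak
    implied_price = peak_flops / FLOPS_PER_DOLLAR  # roughly $75M for a full machine

    print(f"Peak: {peak_flops / 1e9:.0f} Gflops")
    print(f"Implied full-configuration price: ~${implied_price / 1e6:.0f}M")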

3. IBM is predicating its future on both its mainframe vector units and on interconnecting many high performance CMOS microprocessor-based workstations. By 1994 it had introduced the SP series using the RS/6000. IBM has done more work than any other company with a variety of special and general purpose parallel computers, including GF11 (for QCD), RP3, and the Yorktown Simulation Engine.

4. Kendall Square Research, KSR 1 & 2. I believe the KSR scalable multiprocessor is the first breakthrough in high performance computers since the Cray vector and multiprocessor architectures. DARPA's research, including Stanford's, aimed at building a comparable multiprocessor, indicates its significance. All subsequent State Computers use many, but not all, of the concepts in the KSR architecture.

5. Meiko's CS-2 began to deliver in early 1994. This computer uses the fastest Sparc microprocessors and true vector processing units that operate at 108 Mflops (64 bits) or 216 Mflops (32 bits). The switching network, like IBM's network, is high bandwidth (≈200 Mbytes/sec), low latency (≈5 microseconds), and requires no processor overhead. The Meiko computer is almost certain to outperform and have superior cost-performance characteristics to DARPA's multicomputers. Ironically, Meiko won the LLNL summer 1992 RFP for an MPP computer, and delivery began in late 1993.

In contrast to Thinking Machines, Meiko demonstrates that a very good multicomputer can be made by a small, self-funded, experienced team. For a decade, Meiko has been small, growing, and profitable without government support. A level playing field would most likely change its position.

6. TRON is a Japanese program aimed at designing a microprocessor architecture and operating system family to span a range of use from PCs to real time computers. Government funding supported the design and development of these systems by several manufacturers. Although the program started in 1985(?), nothing noteworthy has come of the effort. Instead, the Japanese electronics manufacturers are partnering with US companies to produce microprocessors and use dialects of UNIX and Microsoft's NT.

7. Five other companies: ACRI (a French and European effort), an ill-managed effort aimed at extracting Eurodollars and getting many groups involved in designing a supercomputer; Herb Sullivan's Chopp (technology licensing); Cray Computer; Exa (special purpose CFD computer); Parsytec (Transputer-based, but in transition to other nodes).

8. Status unknown: Scientific Solutions Inc. (a former DARPA or NASA program manager of an unrelated effort starting up to get DARPA funding for a computer).

9. 1980s dead technical computer companies, spanning super-minicomputer, mini-supercomputer, supercomputer, and massively parallel companies: Alliant, American Supercomputer, Ametek, AMT, Astronautics, BBN Supercomputer, Biin, CDC (independent of ETA), Cogent, Culler, Cydrome, Dennelcor, Elexsi, ETA, Evans & Sutherland Supercomputers, Flexible, Floating Point Systems, Gould/SEL, IPM, Key, Multiflow, Myrias, Pixar, Prisma, SAXPY, SCS, Supertek (part of Cray), Suprenum (German National effort), Stardent (Ardent+Stellar), Steve Chen and IBM's Supercomputer Systems Inc., Synapse, Vitec, Vitesse, Wavetracer.

Most of these failures were not the result of funding or market size, but merely of poor product ideas, poor or untimely engineering, or poor overall management. With the exception of perhaps one or two cases (e.g., Multiflow), it is doubtful that any of these companies really contributed anything unique to computing. As such, the technical community is better off not having to waste energy on dead-end machines.

-----------------------

[1] High Tech Ventures: The Guide to Entrepreneurial Success, Addison-Wesley, 1991, enumerates these advances.

[2] Not a computer system per se.
