The Power-Law of Top500



The Power-Law of Top500

Matei Ripeanu

(matei@cs.uchicago.edu)

Most natural and technical phenomena are characterized by highly unbalanced distributions: there are few powerful, devastating earthquakes and countless unnoticeable ones; there are few machines with a peak flops rate larger than 1 tflops while millions of machines work at around 1 mflops. Many of these events (city sizes, incomes, word frequency [1, 2]) fit power-law distributions: the number of events of a certain size is proportional to the size of the event to a negative constant power.

There are many dimensions of variation for entities participating in the Internet: from the obvious ones like CPU speed, available disk space and network bandwidth, to more elaborate ones such as inter-failure time, node trustworthiness, or reliability. We conjecture that these follow similar, highly heterogeneous distributions. Preliminary results support our intuition: Internet’s autonomous system size [3], node bandwidth for nodes in Gnutella network [4, 5] or CPU power for machines in Top500 list [6], all follow power-law distributions (or at least highly variable distributions that can be well approximated as power-laws).

We use an example to depict the characteristics of this distribution: peak mflops rate of the world’s most powerful supercomputers follows a power-law distribution for all years for which data is available (see Figure 1). If we make the assumption that the same distribution extends to machines beyond the Top500, even more interesting is perhaps the heavy-tail property. As a result, aggregating computers in the tail results into a powerful machine comparable with the top ones. By analyzing data for the last seven years, one can notice a fascinating trend: the heavy-tail property becomes more accentuated (Figure 2 shows that the power constant of these distributions gets closer to zero). If this trend persists, the interest will continue to shift from building large machines to large-scale integrations of less powerful systems.

References

[1] M. Schroeder, Fractals, Chaos, Power Laws : Minutes from an Infinite Paradise: W.H. Freeman and Company, 1991.

[2] N. Shiode and M. Batty, "Power Law Distributions in Real and Virtual Worlds," presented at INET 2000, Yokohama, Japan, 2000.

[3] H. Tangmunarunkit, S. Doyle, R. Govindan, S. Jamin, S. Shenker, and W. Willinger, "Does AS Size Determine Degree in AS Topology?," ACM Computer Communication Review, 2001.

[4] S. Saroiu, P. K. Gummadi, and S. D. Gribble, "A Measurement Study of Peer-to-Peer File Sharing Systems," presented at Multimedia Computing and Networking Conference (MMCN), San Jose, CA, USA, 2002.

[5] M. Ripeanu, I. Foster, and A. Iamnitchi, "Mapping the Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems and Implications for System Design," Internet Computing Journal, vol. 6, 2002.

[6] "TOP500 Supercomputer Sites - ," 2002.

-----------------------

[pic]

Figure 1: Peak processing rate (gflops) for world’s fastest supercomputers in Top500 list from 1995 to 2001. Each series of points represents one year on this log-log plot.

[pic]

Figure 4: Evolution of power-law distribution coefficient k over time. As k gets closer to 0 the distribution is more heavy-tailed. Note that k decreases on average 2% per year.

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download