Assignment 1 Solution - Arizona State University

Computer Architecture-I

Assignment 1 Solution

1. Die yield is given by the formula

$$\text{Die Yield} = \text{Wafer Yield} \times \left(1 + \frac{\text{Defects per unit area} \times \text{Die Area}}{\alpha}\right)^{-\alpha}$$

Let us assume a wafer yield of 100% and $\alpha \approx 4$ for current technology.

a. Die yield for AMD Opteron: $(1 + (0.75 \times 1.99)/4)^{-4} = 0.281$

b. Die yield for 8-core SUN Niagara: $(1 + (0.75 \times 3.80)/4)^{-4} = 0.116$

c. The defect rate is the same for the AMD Opteron and the SUN Niagara, but the Niagara die is almost twice the size of the Opteron die. A larger die is more likely to contain a defect, so at the same defect rate the yield of the Niagara suffers in comparison to the Opteron. The larger die also significantly reduces the number of dies per wafer for the Niagara.
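The yields in parts a and b can be checked with a short Python sketch. The function name `die_yield` and its parameter names are illustrative and not part of the assignment; the defect density (0.75) and die areas (1.99 and 3.80) are the values used above, assumed to be in matching area units.

```python
def die_yield(defects_per_area, die_area, alpha=4.0, wafer_yield=1.0):
    """Die yield = wafer yield * (1 + defects * area / alpha) ** (-alpha)."""
    return wafer_yield * (1.0 + defects_per_area * die_area / alpha) ** (-alpha)


# Values taken from parts a and b; wafer yield assumed to be 100%.
opteron = die_yield(0.75, 1.99)
niagara = die_yield(0.75, 3.80)

print(f"Opteron die yield: {opteron:.3f}")
print(f"Niagara die yield: {niagara:.3f}")
print(f"Niagara / Opteron yield ratio: {niagara / opteron:.2f}")
```

The ratio printed on the last line makes the point of part c concrete: roughly doubling the die area cuts the yield by more than half at the same defect density.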

4. Question 1.4

a. To compute the wattage of the server's power supply, we first calculate the power consumed by the entire system at maximum load.

i. Sun Niagara 8-core chip: Power consumed at max load = 79W

ii. 2 x 1GB 184-pin Kingston DRAM: Power consumed at max load = 2 x 3.7W = 7.4W

iii. 2 x 7200rpm Hard Drive: Since we are interested in the max load condition, we assume 0% idle time for the hard drives. Power per drive = 7.9W. Total power for 2 drives = 15.8W

Thus, the total power consumed by the system = (79+7.4+15.8) = 102.2W.

Power Supply Efficiency = Power(output) / Power(input). With a power supply efficiency of 70%, Power(input) = 102.2/0.7 = 146W. This is the required power supply wattage for the system.
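The sizing calculation can be reproduced with the following sketch. The dictionary keys and variable names are illustrative; the 70% efficiency is the value implied by the division by 0.7 above.

```python
# Per-component power at maximum load, in watts (values from part a).
components_w = {
    "niagara_cpu": 79.0,
    "dram_2x1gb": 2 * 3.7,
    "disks_2x7200rpm": 2 * 7.9,
}
efficiency = 0.70  # power supply efficiency assumed in the solution

total_load_w = sum(components_w.values())
# The supply must draw load / efficiency to deliver the load.
supply_w = total_load_w / efficiency

print(f"System load:  {total_load_w:.1f} W")
print(f"Supply input: {supply_w:.0f} W")
```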

b. The hard drive is idle for 40% of the time, drawing 4W when idle and 7.9W when active. Average power = (0.4 x 4) + (0.6 x 7.9) = 6.34W

c. Assuming rpm is the only factor affecting the read/seek time of a disk, the time spent on read/seek is inversely proportional to the disk rpm. The 7200 rpm disk spends 60% of the time on read/seek. For the same set of transactions, the 5400 rpm disk takes 7200/5400 = 4/3 times as long, i.e. 80% read/seek. Thus, the 5400 rpm disk will idle for 20% of the time.
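Parts b and c can be sketched in a few lines. The function names are hypothetical, and the model rests on the simplifying assumption used in the solution: busy (read/seek) time scales inversely with rpm and nothing else changes.

```python
IDLE_W = 4.0    # idle power per drive, from part b
ACTIVE_W = 7.9  # read/seek power per drive, from part a


def average_disk_power(idle_fraction):
    """Time-weighted average power of one drive."""
    return idle_fraction * IDLE_W + (1.0 - idle_fraction) * ACTIVE_W


def idle_fraction_at(rpm, ref_rpm=7200, ref_busy=0.60):
    """Idle fraction at another rpm, assuming busy time scales as 1/rpm."""
    busy = ref_busy * ref_rpm / rpm
    return 1.0 - busy


power_40_idle = average_disk_power(0.40)
idle_5400 = idle_fraction_at(5400)

print(f"Average power at 40% idle: {power_40_idle:.2f} W")
print(f"Idle fraction at 5400 rpm: {idle_5400:.0%}")
```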

6. Question 1.6

a. The performance/power ratio for each benchmark is tabulated below.

Benchmark         SPECjbb    SPECWeb
Sun Fire T2000    212.677    42.427
IBM x346           91.289     9.926

b. If power is the main concern, the Sun Fire T2000 is the better choice since it has lower power consumption and a higher performance/power ratio on both benchmarks.

c. It is true that "For database benchmarks, the cheaper the system, the lower cost per database operation the system is". Even so, some server farms may opt for expensive servers. These servers offer not only better performance but also lower power consumption. Power consumption is an ever-growing concern for large server farms, which may consist of over 10000 processors and disks. Cheaper systems might yield a lower cost per operation, which is desirable, but they may not be power efficient, and the cost of excess power consumption and cooling is significant. Thus, it is necessary to weigh both factors when making the choice.

9. Question 1.9

a. FIT = 100. Since FIT is the number of failures per billion ($10^9$) hours,

MTTF = $10^9$/FIT = $10^9$/100 = $10^7$ hours

b. MTTR = 1 day = 24 hours

Availability = MTTF/(MTTF + MTTR) = $10^7/(10^7 + 24) \approx 0.9999976$
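The two reliability figures follow directly from the definitions. A minimal sketch, with illustrative variable names:

```python
fit = 100              # failures per billion device-hours (given)
mttr_hours = 24        # one day to repair (given)

mttf_hours = 1e9 / fit
availability = mttf_hours / (mttf_hours + mttr_hours)

print(f"MTTF = {mttf_hours:.0f} hours")
print(f"Availability = {availability:.7f}")
```

Because the MTTF is so much larger than the MTTR, the availability differs from 1 only in the sixth decimal place.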

12. Question 1.12

a. Tabulated results for performance normalized to the Pentium D820

Chip                 Memory Performance    Dhrystone Performance
Athlon64 X2 4800+    1.141                 1.361235217
Pentium EE840        1.076                 1.241327201
Pentium D820         1                     1
Athlon64 X2 3800+    0.980333333           1.12542707
Pentium 4            0.910333333           0.500722733
Athlon64 3000+       0.984333333           0.501182654
Pentium 4 570        1.167                 0.73653088
Processor X          2.333333333           0.328515112

b. Arithmetic mean of performance for each processor, tabulated for both the original and the normalized performance values.

Chip                 Arithmetic Mean for Original Results    Arithmetic Mean for Normalized Results
Athlon64 X2 4800+    12070.5                                 1.251117608
Pentium EE840        11060.5                                 1.158663601
Pentium D820         9110                                    1
Athlon64 X2 3800+    10035                                   1.052880201
Pentium 4            5176                                    0.705528033
Athlon64 3000+       5290.5                                  0.742757994
Pentium 4 570        7355.5                                  0.95176544
Processor X          6000                                    1.330924223

c. The table above leads to conflicting conclusions about the performance of Processor X. According to the first column (arithmetic mean of the original results), the Athlon64 X2 4800+, Pentium EE840, Pentium D820, Athlon64 X2 3800+ and Pentium 4 570 are all faster than Processor X. According to the second column (arithmetic mean of the normalized results), Processor X is faster than all of these processors.
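The conflict in part c can be reproduced from the normalized values in part a. In this sketch the raw scores are reconstructed by multiplying the normalized values by the Pentium D820 baseline scores, taken to be 3000 (memory) and 15220 (Dhrystone) as used in Question 1.13c; that reconstruction is an assumption of the sketch, and all variable names are illustrative.

```python
chips = ["Athlon64 X2 4800+", "Pentium EE840", "Pentium D820",
         "Athlon64 X2 3800+", "Pentium 4", "Athlon64 3000+",
         "Pentium 4 570", "Processor X"]
memory_norm = [1.141, 1.076, 1.0, 0.980333333,
               0.910333333, 0.984333333, 1.167, 2.333333333]
dhrystone_norm = [1.361235217, 1.241327201, 1.0, 1.12542707,
                  0.500722733, 0.501182654, 0.73653088, 0.328515112]

# Assumed Pentium D820 raw scores, used to undo the normalization.
MEM_BASE, DHRY_BASE = 3000, 15220

rows = list(zip(chips, memory_norm, dhrystone_norm))
mean_norm = {c: (m + d) / 2 for c, m, d in rows}
mean_raw = {c: (m * MEM_BASE + d * DHRY_BASE) / 2 for c, m, d in rows}

best_norm = max(mean_norm, key=mean_norm.get)
best_raw = max(mean_raw, key=mean_raw.get)

print(f"Best by normalized mean: {best_norm}")
print(f"Best by original mean:   {best_raw}")
```

The arithmetic mean of raw scores is dominated by the benchmark with the larger numbers (Dhrystone), while the mean of normalized scores weights both benchmarks equally relative to the reference machine, which is why the two rankings disagree.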

d. The geometric means of the normalized Dhrystone performance for the single-core and dual-core processors are given below:

Geometric mean (Single Core) = 0.4964

Geometric mean (Dual Core) = 1.1743
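The geometric means can be recomputed from the normalized Dhrystone values in part a. The grouping used here is an assumption inferred from the numbers: the first four chips in the table are treated as dual core and the last four, including Processor X, as single core, which reproduces the values stated above.

```python
import math


def geometric_mean(values):
    """n-th root of the product, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Normalized Dhrystone performance from part a (assumed grouping).
dual_core = [1.361235217, 1.241327201, 1.0, 1.12542707]
single_core = [0.500722733, 0.501182654, 0.73653088, 0.328515112]

gm_dual = geometric_mean(dual_core)
gm_single = geometric_mean(single_core)

print(f"Geometric mean (Single Core) = {gm_single:.4f}")
print(f"Geometric mean (Dual Core)   = {gm_dual:.4f}")
```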

e. The scatter graph for Dhrystone performance Vs Memory Performance is given below:

f. The scatter graph clearly indicates that the dual core processors outperform their single core counterparts in Dhrystone performance. The Dhrystone benchmark is an integer benchmark which primarily exercises the logical/arithmetic functionality of the CPU. The dramatic improvement in Dhrystone performance can be explained simply by the fact that there are 2 cores available for computation instead of 1. It can also be seen that there is no major improvement in memory performance. This is because memory latency is not related to the number of CPU cores available. Thus, even if the processor is a dual core, the latency of memory load/store operations is similar to that of a single core. The only exception is the memory performance of Processor X, which is fictitiously high.

13. Question 1.13

a. It is given that 40% of operations are memory-centric and 60% are CPU-centric. The following table gives the weighted execution times for the benchmarks.

Chip                 Execution Time        Execution Time           Weighted Arithmetic
                     (Memory Benchmark)    (Dhrystone Benchmark)    Mean
Athlon64 X2 4800+    0.000292141           4.82672E-05              0.00015
Pentium EE840        0.000309789           5.29297E-05              0.00016
Pentium D820         0.000333333           6.5703E-05               0.00017
Athlon64 X2 3800+    0.00034002            5.83805E-05              0.00017
Pentium 4            0.000366166           0.000131216              0.00023
Athlon64 3000+       0.000338639           0.000131096              0.00021
Pentium 4 570        0.000285633           8.92061E-05              0.00017
Processor X          0.000142857           0.0002                   0.00018
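The weighted means in the last column can be recomputed from the two execution-time columns using the 40%/60% weights from part a. Variable names are illustrative.

```python
chips = ["Athlon64 X2 4800+", "Pentium EE840", "Pentium D820",
         "Athlon64 X2 3800+", "Pentium 4", "Athlon64 3000+",
         "Pentium 4 570", "Processor X"]
memory_time = [0.000292141, 0.000309789, 0.000333333, 0.00034002,
               0.000366166, 0.000338639, 0.000285633, 0.000142857]
dhrystone_time = [4.82672e-05, 5.29297e-05, 6.5703e-05, 5.83805e-05,
                  0.000131216, 0.000131096, 8.92061e-05, 0.0002]

W_MEMORY, W_CPU = 0.4, 0.6  # workload mix given in part a

weighted = [W_MEMORY * m + W_CPU * d
            for m, d in zip(memory_time, dhrystone_time)]

for chip, w in zip(chips, weighted):
    print(f"{chip:<18} {w:.5f}")
```

Lower weighted execution time is better, so under this workload mix the Athlon64 X2 4800+ comes out ahead and the single-core Pentium 4 comes out last.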

b. Since the application suite is CPU-intensive, we consider the Dhrystone performance of the two CPUs in comparison. Speed-up from Pentium 4 570 to Athlon64 X2 4800+ can be measured as the ratio of their Dhrystone performance.

Hence Speed-up = 20718/11210 = 1.848

c. Let the required fraction of memory operations be `a', so that the fraction of processor operations is (1-a). For equal performance of the Pentium 4 570 and the Pentium D820, we need

3501a + 11210(1-a) = 3000a + 15220(1-a)

Thus, 4511a = 4010, i.e. a = 0.89

Thus, the performance of Pentium 4 570 equals Pentium D 820 when there are 89% memory operations and 11% processor operations.
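The break-even point can also be obtained by solving the linear equation in general form. The helper name `break_even_memory_fraction` is hypothetical; the four scores are the ones used in the equation above.

```python
def break_even_memory_fraction(mem_a, cpu_a, mem_b, cpu_b):
    """Solve mem_a*a + cpu_a*(1-a) = mem_b*a + cpu_b*(1-a) for a."""
    cpu_gap = cpu_b - cpu_a  # advantage of machine B on the CPU benchmark
    mem_gap = mem_a - mem_b  # advantage of machine A on the memory benchmark
    return cpu_gap / (mem_gap + cpu_gap)


# Machine A: Pentium 4 570, machine B: Pentium D820.
a = break_even_memory_fraction(3501, 11210, 3000, 15220)

print(f"Memory fraction at break-even: {a:.2f}")
print(f"CPU fraction at break-even:    {1 - a:.2f}")
```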

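14. Question 1.14

According to Amdahl's Law, speed-up is given by

$$\text{Speed-up}_{\text{system}} = \frac{(\text{Execution Time})_{\text{old}}}{(\text{Execution Time})_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speed-up}_{\text{enhanced}}}}$$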

a. The first application is run in isolation and 40% of it is parallelizable. Thus, $\text{Fraction}_{\text{enhanced}} = 0.4$.

Also, since the new processor is a dual core, $\text{Speed-up}_{\text{enhanced}} = 2$. Then, by the formula above, $\text{Speed-up}_{\text{system}} = 1/(0.6 + 0.4/2) = 1.25$

b. The second application is run in isolation and 99% of it is parallelizable. Hence, we have:

$\text{Fraction}_{\text{enhanced}} = 0.99$ and $\text{Speed-up}_{\text{enhanced}} = 2$. Thus, $\text{Speed-up}_{\text{system}} = 1/(0.01 + 0.99/2) = 1.98$

c. Now both the first and the second application are running on the system. Since the first application uses 80% of system resources, only 40% of 80% (= 32%) will be enhanced by a factor of 2.

Thus, $\text{Speed-up}_{\text{system}} = 1/(0.68 + 0.32/2) = 1.19$

d. Similarly, the second application uses the remaining 20% of system resources, so 99% of 20% (= 19.8%) will be enhanced by a factor of 2.

Thus, $\text{Speed-up}_{\text{system}} = 1/(0.802 + 0.198/2) = 1.11$
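All four parts apply the same formula with a different enhanced fraction, which the following sketch makes explicit. The function name `amdahl_speedup` and the dictionary layout are illustrative; the fractions are the ones described in parts a to d (40%, 99%, 40% of 80%, and 99% of 20%).

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speed-up when only part of the work is accelerated."""
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced)


CORES = 2  # dual-core processor, so the enhanced part runs 2x faster

fractions = {
    "a": 0.40,         # first application alone
    "b": 0.99,         # second application alone
    "c": 0.80 * 0.40,  # first application using 80% of resources
    "d": 0.20 * 0.99,  # second application using 20% of resources
}

speedups = {part: amdahl_speedup(f, CORES) for part, f in fractions.items()}

for part, s in speedups.items():
    print(f"{part}. fraction = {fractions[part]:.3f}, speed-up = {s:.2f}")
```

Even with 99% of the second application parallelizable, two cores cannot deliver more than a 2x speed-up, and the gain shrinks quickly once the application only accounts for a small share of the system's work.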
