Bnf - De Montfort University



CSCI 1802

Computer Systems and Networks

Part 1: Computer Number Systems

References and further reading:

Blundell B: Computer Systems and Networks

Patt & Patel: Introduction to Computing Systems pp 1 – 50

Ceri, Mandroli, Sbattella: The Art & Craft of Computing

Computer Number Systems

1 Motivation

At the lowest level a computer has an electronic component that stores either a high voltage, or a low voltage. We take a high voltage to represent the digit 1, and a low voltage to represent the digit 0. So now we have a very simple and very limited counting device.

With two such components we have four states: 00,01,10 and 11.

These four states can represent the numbers 0, 1, 2, 3.

Similarly with 3 components we can represent eight states: 000,001,010,011,100,101,110 and 111.

These eight states can represent the numbers 0, 1, 2, 3, 4, 5, 6 and 7.

1.1 Question

How many states are there with:

four components?

five components?

six components?

seven components?

eight components?

What is the pattern?

When computer scientists and engineers work at the level of the computer, they represent numbers using binary digits 0, 1. As shorthand they use octal digits (0,1,2,3,4,5,6,7) to represent a group of 3 binary digits; or hexadecimal digits (0,1,2,3,4,5,6,7,8,9,A, B, C, D, E, F) to represent a group of 4 binary digits.

The first section explains how to work with these number systems.

First, we must be able to understand raising a number to a power.

Can you do the following (without a calculator!)?

(Note that any number to the power 0 equals 1, i.e. n^0 = 1)

2^2 =    2^3 =    3^2 =    3^3 =    10^4 =

2^0 =    5^0 =

10^-2 =    2^-2 =    10^-5 =    3^-3 =

2^(1/2) =    2^0.5 =    2^(1/3) =

2 Radix (or Base)

Numbers are written using a place system, e.g. Hundreds, Tens and Units. For example, 123 is the number where 1 represents one 100, 2 represents 2 tens, and 3 represents 3 units (or ones). Altogether the number is 100+20+3. Each place in the number system is 10 times larger than the previous place. This idea can be extended to other number bases.

• Decimal is base 10 (or radix 10) and uses the symbols 0 1 2 3 4 5 6 7 8 9

The number 271 is 2×10^2 + 7×10^1 + 1×10^0

• Binary is base 2 (or radix 2) and uses the symbols 0 1

The number 01101_2 is 0×2^4 + 1×2^3 + 1×2^2 + 0×2^1 + 1×2^0

• Octal is base 8 (or radix 8) and uses the symbols 0 1 2 3 4 5 6 7

The number 5172_8 is 5×8^3 + 1×8^2 + 7×8^1 + 2×8^0

• Hexadecimal is base 16 (or radix 16) and uses the symbols 0 1 2 3 4 5 6 7 8 9 A B C D E F

where A = 10, B = 11, C = 12, D = 13, E = 14, F = 15

The number 5A1B_16 is 5×16^3 + 10×16^2 + 1×16^1 + 11×16^0

= 20480 + 2560 + 16 + 11

= 23067_10
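The place-value rule used in the examples above can be sketched in Python. This is only an illustration: the name `to_decimal` and the `DIGITS` string are assumptions made for this sketch, not part of the course text.

```python
DIGITS = "0123456789ABCDEF"

def to_decimal(number: str, base: int) -> int:
    """Sum digit * base**position, with position 0 at the right-hand end."""
    total = 0
    for position, symbol in enumerate(reversed(number.upper())):
        digit = DIGITS.index(symbol)          # A = 10, B = 11, ..., F = 15
        if digit >= base:
            raise ValueError(f"{symbol!r} is not a valid base-{base} digit")
        total += digit * base ** position
    return total

print(to_decimal("271", 10))    # 271
print(to_decimal("01101", 2))   # 13
print(to_decimal("5172", 8))    # 2682
print(to_decimal("5A1B", 16))   # 23067
```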

Ref: Blundell pp10-13, pp36-39

Exercises

1. List some of the different number bases you use in everyday life. Hint: think about time …

3 To Translate from one base to another

3.1 Binary to Decimal

As illustrated above, starting at the right hand side, the digits represent increasing powers of 2.

11101001_2 = 1×2^7 + 1×2^6 + 1×2^5 + 0×2^4 + 1×2^3 + 0×2^2 + 0×2^1 + 1×2^0

= 128 + 64 + 32 + 0 + 8 + 0 + 0 + 1

= 233_10

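The easiest way is to use a table, e.g. to convert 11101001_2:

|Powers of 2 |2^7 |2^6 |2^5 |2^4 |2^3 |2^2 |2^1 |2^0 |
|Decimal value |128 |64 |32 |16 |8 |4 |2 |1 |
|Binary digit |1 |1 |1 |0 |1 |0 |0 |1 |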

Here’s a more complicated example:

0101101111101001_2

= 0+16384+0+4096+2048+0+512+256+128+64+32+0+8+0+0+1

= 23529_10

We can write this as a table as shown below.

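|Powers of 2 |2^15 |2^14 |2^13 |2^12 |2^11 |2^10 |2^9 |2^8 |2^7 |2^6 |2^5 |2^4 |2^3 |2^2 |2^1 |2^0 |
|Decimal value |32768 |16384 |8192 |4096 |2048 |1024 |512 |256 |128 |64 |32 |16 |8 |4 |2 |1 |
|Binary digit |0 |1 |0 |1 |1 |0 |1 |1 |1 |1 |1 |0 |1 |0 |0 |1 |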

3.2 Binary to Octal

Starting from the right hand side, divide the number into groups of 3 digits, and then convert each group into an octal value. This works because 3 binary digits represent the range of values from 0 to 7, i.e. all the octal digits.

0101101111101001_2 = 0 101 101 111 101 001_2

= 0 5 5 7 5 1 = 055751_8

3.3 Binary to Hexadecimal

Starting from the right hand side, divide the number into groups of 4 digits, and then convert each group into a hexadecimal value. This works because 4 binary digits represent the range of values from 0 to 15, i.e. all the hexadecimal digits.

0101101111101001_2 = 0101 1011 1110 1001_2

= 5 B E 9

= 5BE9_16
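The grouping method for octal and hexadecimal can be sketched as follows. The function name `group_bits` is hypothetical; the sketch assumes the input is a string of 0s and 1s and pads it on the left when the length is not a multiple of the group size.

```python
def group_bits(bits: str, group_size: int) -> str:
    """Convert a binary string to octal (group_size=3) or hex (group_size=4)."""
    # Pad on the left so the length is a multiple of the group size.
    padding = (-len(bits)) % group_size
    bits = "0" * padding + bits
    groups = [bits[i:i + group_size] for i in range(0, len(bits), group_size)]
    # Each group is a small binary number that maps to exactly one digit.
    return "".join("0123456789ABCDEF"[int(group, 2)] for group in groups)

print(group_bits("0101101111101001", 3))   # 055751
print(group_bits("0101101111101001", 4))   # 5BE9
```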

3.4 Decimal to Binary

3.4.1 Either...

Subtract from the original number the highest power of 2 that does not exceed the value of the number, put a 1 in the corresponding bit position.  Then continue down through the powers of 2 inserting 1's or 0's in the corresponding positions. (A table of powers of 2 helps here.)
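As a rough sketch of this subtraction method (the function name `to_binary` and the fixed width of 8 bits are assumptions chosen for illustration):

```python
def to_binary(value: int, width: int = 8) -> str:
    """Build the bits left to right by subtracting powers of 2."""
    if not 0 <= value < 2 ** width:
        raise ValueError("value does not fit in the given number of bits")
    bits = ""
    for exponent in range(width - 1, -1, -1):
        power = 2 ** exponent
        if power <= value:      # this power of 2 fits: record a 1 and subtract it
            bits += "1"
            value -= power
        else:                   # it does not fit: record a 0
            bits += "0"
    return bits

print(to_binary(233))   # 11101001
print(to_binary(29))    # 00011101
```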

| |Powers of 2 |2^7 |2^6 |2^5 |2^4 |2^3 |2^2 |

|(+)0 |0 |0 |1 |1 |1 |0 |1 |

(+) 0×64 + 0×32 + 1×16 + 1×8 + 1×4 + 0×2 + 1×1

-29 is represented by 10011101_2

|sign |64 |32 |16 |8 |4 |2 |1 |

|1 |1 |1 |0 |0 |0 |1 |0 |

i.e. 1×(-127) + 1×64 + 1×32 + 0×16 + 0×8 + 0×4 + 1×2 + 0×1

= -127 + 98 = -29

This system is popular because it is easy to change a binary number to a negative equivalent.

To negate a binary number in one’s complement,

- just work out the positive representation

- then flip the bits. i.e. change all 0's to 1's and all 1's to 0's.

A computer can do this very quickly.

Example: -84 in 1’s complement:

+84 is represented by 01010100_2

-84 is represented by 10101011_2, i.e. -84 = -127 + 43

Of course, this will only work for numbers in the appropriate range, so you need to know how many bits are being used.

4.3 Two's complement.

This is very similar to one's complement. The only difference is that the place value of the leftmost bit is -2^n rather than -2^n + 1, where 2^n is the power of two for that position (n = 7 for an 8-bit number). So for an 8-bit number it is -128 rather than -127. This is a little more consistent with the place value system, so that the leftmost place value represents a negated power of two.

e.g. -29 is represented by -127 + 98 = 11100010_2 in one's complement,

and by -128 + 99 = 11100011_2 in two's complement.

Another example: +84 and -84

1. Using the place values method

In One’s complement (and in Two’s complement):

+84 is represented by 64+16+4 = 01010100_2

|-127 |64 |32 |16 |8 |4 |2 |1 |

|0 |1 |0 |1 |0 |1 |0 |0 |


In One’s complement:

-84 is represented by -127+43 = -127+32+8+2+1 = 10101011_2

|-127 |64 |32 |16 |8 |4 |2 |1 |

|1 |0 |1 |0 |1 |0 |1 |1 |

In Two’s complement:

-84 is represented by -128+44 = -128+32+8+4 = 10101100_2

|-128 |64 |32 |16 |8 |4 |2 |1 |

|1 |0 |1 |0 |1 |1 |0 |0 |

2. An easier way to get a negative value in two’s complement is to use the algorithm:

- create the positive binary number using one's complement

- convert it to negative by flipping the bits

- then add one to the result to get the two’s complement value
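The flip-and-add-one algorithm can be sketched in Python, working on bit strings so that the steps stay visible. The function names are hypothetical, and the sketch assumes the input is the positive value already written with the full number of bits.

```python
def flip_bits(bits: str) -> str:
    """One's complement: change every 0 to 1 and every 1 to 0."""
    return "".join("1" if b == "0" else "0" for b in bits)

def negate_twos_complement(bits: str) -> str:
    """Two's complement negation: flip the bits, then add one."""
    width = len(bits)
    flipped = int(flip_bits(bits), 2)
    # Keep only `width` bits so that a carry out of the left is dropped.
    return format((flipped + 1) % 2 ** width, f"0{width}b")

def twos_complement_value(bits: str) -> int:
    """Read the bits using place values -2^(n-1), 2^(n-2), ..., 2, 1."""
    weights = [2 ** i for i in range(len(bits) - 1, -1, -1)]
    weights[0] = -weights[0]
    return sum(w for w, b in zip(weights, bits) if b == "1")

plus_84 = "01010100"
print(flip_bits(plus_84))                  # 10101011 (one's complement of +84)
minus_84 = negate_twos_complement(plus_84)
print(minus_84)                            # 10101100
print(twos_complement_value(minus_84))     # -84
```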

|Find +84 in 1’s Comp |

|1’s Comp |

|1’s Comp |

|2’s Comp |-128 |64 |

| |Binary |Decimal | Binary |Decimal |

|1. Signed magnitude | | 01111111 = 127 |

|2. One's complement | 10000000 = -127 | |

|3. Two's complement | | |

|4. Excess 127 | | |

3. Repeat this for 16-bit numbers.

[Hints: 2^15 = 32768; would we use Excess 127 or something else?]

5 Addition and Subtraction in Binary

5.1 Simple addition of positive integers

In decimal addition we 'carry' when the result is greater than 9. 'Carrying' works because when we get ten values in the 10^n place then we have 10×10^n, which is the same as 1×10^(n+1).

e.g. 10×10^2 = 1×10^3.

In binary addition we 'carry' when the result is greater than 1,

e.g.

1 + 0 = 1

1 + 1 = 10

1 + 1 + 1 = 11

Thus,

     2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0           10^2 10^1 10^0
      0   1   0   1   1   0   0   1   +                8    9   +
      0   1   1   1   0   1   0   1              1     1    7
  =   1   1   0   0   1   1   1   0        =     2     0    6

This works well for unsigned positive numbers.
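Column-by-column addition with a carry can be sketched as below. The name `add_binary` is an assumption for this illustration; the function treats both inputs as unsigned binary strings.

```python
def add_binary(a: str, b: str) -> str:
    """Add two unsigned binary strings column by column, right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column = int(bit_a) + int(bit_b) + carry
        result.append(str(column % 2))   # digit written in this column
        carry = column // 2              # carry 1 when the column sum exceeds 1
    if carry:
        result.append("1")
    return "".join(reversed(result))

print(add_binary("01011001", "01110101"))   # 11001110  (89 + 117 = 206)
```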

5.2 Arithmetic using positive and negative integers

First, try the following simple decimal additions and subtractions.

Note that we are using the convention here that a raised minus sign indicates a negative number,

e.g. ⁻3 means negative 3, whereas - 3 means subtract 3.

1 + 3 =    3 - 1 =    1 - 3 =    ⁻1 + 3 =

1 - ⁻3 =    3 + ⁻1 =    1 + ⁻3 =    ⁻1 - ⁻3 =

1 + ⁻(⁻3) =

Question: Is subtracting X the same as adding –X?

Yes. Extending this to binary arithmetic we can simplify subtraction of binary numbers by adding a negative representation of the number (e.g. two's complement).

Using two’s complement representation we can add and subtract both positive and negative numbers by always using the add operation.

REMEMBER: In 2’s complement:

• a positive value is unchanged

e.g. +40 = 32 + 8 = 00101000

• a negative value has the positive value changed to the negative value by the process we’ve just learnt

e.g. -40 = - (32 + 8) = - 00101000

= 11010111+1

= 11011000

5.3 Addition and subtraction examples

Example 1: adding 2 positive values

e.g. 5 + 12

decimal 2’s complement

+5 00000101

+12 00001100 +

+17 00010001 Check the answer is correct

Example 2: subtracting second number from first number

e.g. 5 – 12 = 5 + –12

decimal 2’s complement

+5 00000101

–12 11110100 +

–7 11111001 Check the answer is correct

Example 3:

Here is an example where there is a ‘carry’ on the left hand side of the result. Because the two operands have different signs, the carried 1 is simply ignored.

decimal 2’s complement

–5 11111011

+12 00001100 +

+7 [1]00000111 sum = +7 (correct)

Example 4:

However, in this example, where the two operands both have the same sign, the carried 1 on the left hand side of the result and the fact that the result has a different sign from both operands indicate that there has been an overflow and the result is wrong.

decimal 2’s complement

–80 10110000

–64 11000000 +

–144 [1]01110000 sum = +112 (wrong)
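The rule used in Examples 3 and 4 (drop the carry, and flag an overflow only when both operands share a sign that the result does not) can be sketched as follows. The function name `add_8bit` is hypothetical, and the sketch assumes both inputs lie in the range -128 to +127.

```python
def add_8bit(a: int, b: int):
    """Add two values in 8-bit two's complement; report carry-out and overflow."""
    raw = (a & 0xFF) + (b & 0xFF)     # add the two 8-bit patterns
    carry_out = raw > 0xFF            # a ninth bit appeared on the left
    pattern = raw & 0xFF              # keep only 8 bits; the carry is dropped
    result = pattern - 256 if pattern & 0x80 else pattern
    # Overflow: operands share a sign but the result has the other sign.
    overflow = (a < 0) == (b < 0) and (result < 0) != (a < 0)
    return format(pattern, "08b"), result, carry_out, overflow

print(add_8bit(5, -12))     # ('11111001', -7, False, False)
print(add_8bit(-5, 12))     # ('00000111', 7, True, False)
print(add_8bit(-80, -64))   # ('01110000', 112, True, True)
```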

5.4 Exercise

By considering the column values, (i.e. –128, 64, 32 etc) explain what has occurred in the last two examples.

6 Some effects when using finite-precision numbers (i.e. using computers)

6.1 Overflow, Underflow and Rounding

Suppose a computer uses 8 bits to represent integers. The range of values that can be stored (assuming we use 2's complement) is then -128 to +127.

Thus a computer using 8 bits to store integers cannot do every arithmetic calculation you might want to give it.

Examples:

2 + 2 = 4 (correct)

70 + 70 = 140 (too big to be represented: integer overflow)

5 × 5 = 25 (correct)

5 × 50 = 250 (too big: integer overflow)

50 – 95 = -45 (correct)

-50 – 95 = -145 (too big negative: negative overflow)

4 / 2 = 2 (correct)

5 / 2 = 2.5 (not an integer, so cannot be stored as an integer except by rounding or truncating it)

6.2 Definitions

Integer overflow: Result too big (+ve)

Negative overflow: Result too big (–ve)

Rounding: Result cannot be stored exactly, so nearest valid value is stored.

Rounding error: Difference between correct value and calculated value.

Relative error: Rounding error / Correct value (i.e. rounding error divided by correct value)

6.3 Effect of order of calculations when using finite-precision numbers (i.e. using computers)

In these examples, assume 8-bit two's complement representation of integers. (Therefore, the range of values is -128 to +127)

6.3.1 Example 1:

Calculate a + b – c

Method 1) calculate b – c then add a

Method 2) calculate a + b then subtract c

[ Associative law states: a + ( b – c ) = ( a + b ) – c]

Using a = 80, b = 50, c = 30

Method 1) 80 + (50 – 30)

= 80 + 20

= 100 Method 1 gives correct answer

Method 2) (80 + 50) – 30

= (overflow) – 30

= ? Method 2 gives wrong answer!

6.3.2 Example 2:

Calculate a * ( b – c )

Method 1) calculate b – c then multiply by a

Method 2) calculate a * b then subtract a * c

[ Distributive law says: a * ( b – c ) = ( a * b ) – ( a * c ) ]

Using a = 8, b = 40, c = 30

Method 1) a * ( b – c )

= 8 * (40 – 30 )

= 8 * 10

= 80 Method 1 gives correct answer.

Method 2) ( a * b ) – ( a * c )

= ( 8 * 40 ) – ( 8 * 30 )

= 320 – 240

(overflow) – (overflow)

= (garbage) ? Method 2 gives wrong answer!
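Both examples can be replayed with a small sketch that checks every intermediate result against the 8-bit range. Treating an out-of-range intermediate as an error is an assumption of this sketch (real hardware may instead wrap around or set a flag), and the names `check` and `attempt` are hypothetical.

```python
INT8_MIN, INT8_MAX = -128, 127

def check(value: int) -> int:
    """Return value if it fits in 8-bit two's complement, otherwise signal overflow."""
    if not INT8_MIN <= value <= INT8_MAX:
        raise OverflowError(f"{value} does not fit in 8 bits")
    return value

def attempt(calc, a, b, c):
    """Run calc(a, b, c) and report either its result or the overflow it hit."""
    try:
        return calc(a, b, c)
    except OverflowError as error:
        return f"overflow: {error}"

# Example 1: a + b - c with a = 80, b = 50, c = 30
print(attempt(lambda a, b, c: check(a + check(b - c)), 80, 50, 30))   # 100
print(attempt(lambda a, b, c: check(check(a + b) - c), 80, 50, 30))   # overflow: 130 does not fit in 8 bits

# Example 2: a * (b - c) with a = 8, b = 40, c = 30
print(attempt(lambda a, b, c: check(a * check(b - c)), 8, 40, 30))    # 80
print(attempt(lambda a, b, c: check(check(a * b) - check(a * c)), 8, 40, 30))   # overflow: 320 does not fit in 8 bits
```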

6.4 Arithmetic – computers vs humans

The conclusions from section 6 of this booklet are:

1. Arithmetic using computers is not the same as arithmetic done by people.

2. It is important to understand

• how computers work

• their limitations

7 Binary Representation of Characters

7.1 ASCII Character codes

Characters like 'a', 'b', 'c', '2', '!' are represented by number codes. ASCII (American Standard Code for Information Interchange) is a code used to encode characters. Standard ASCII uses 7 bits and allows 128 characters to be represented. Extended ASCII uses 8 bits and allows 256 characters to be represented. An additional 8th or 9th bit might be used as a parity bit, which is used to detect errors.

• odd parity - the parity bit is set to 0 or 1 to give an odd number of 1's in total.

• even parity – the parity bit is set to 0 or 1 to give an even number of 1's in total.
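A minimal sketch of how a parity bit could be computed for a 7-bit code; the function name `parity_bit` is an assumption made for illustration.

```python
def parity_bit(code: int, even: bool = True) -> int:
    """Return the parity bit for a code under even or odd parity."""
    ones = bin(code).count("1")
    # Even parity: the total number of 1s (code plus parity bit) must be even.
    bit_for_even = ones % 2
    return bit_for_even if even else 1 - bit_for_even

code = ord("C")                       # 43 Hex = 1000011, which has three 1s
print(parity_bit(code, even=True))    # 1 (makes four 1s in total)
print(parity_bit(code, even=False))   # 0 (keeps three 1s in total)
```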

Standard characters that can be encoded are:

• Command characters: non-printing transmission and printer control codes, 00..1F Hex

• Alphanumeric characters: {0 .. 9} codes 30..39 Hex; {A .. Z} codes 41..5A Hex; {a .. z} codes 61..7A Hex

• Symbols: punctuation and arithmetic characters, e.g. the space character is 20 in Hex
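These code ranges can be checked with a short sketch. It relies on Python's built-in `ord`, which returns the Unicode code point; for these characters that is the same as the ASCII code. The helper name `ascii_hex` is hypothetical.

```python
def ascii_hex(ch: str) -> str:
    """Return the code of a single character as two hexadecimal digits."""
    return format(ord(ch), "02X")

for ch in ["0", "9", "A", "Z", "a", "z", " "]:
    print(repr(ch), ascii_hex(ch))
# '0' 30, '9' 39, 'A' 41, 'Z' 5A, 'a' 61, 'z' 7A, ' ' 20
```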

ASCII codes for printable characters

(Note that the codes are given here in Hexadecimal - why is that?)

|Hex |0 |1 |

|char |Character, not necessarily printable |1 byte ASCII code |
|int |Integer value |-32767 to +32767, i.e. -(2^15 - 1) .. 2^15 - 1 |
|short |A short int but … |often the same as int |
|long |A long int |-2147483647 to +2147483647, i.e. -(2^31 - 1) .. 2^31 - 1 |
|float |Real (or floating point) value |6 decimal digits of precision, min value 1E-37, max value 1E+37, smallest value x such that 1+x ≠ 1 is 1E-5 |
|double |Double precision floating point value |10 decimal digits of precision, min value 1E-37, max value 1E+37, smallest value x such that 1+x ≠ 1 is 1E-9 |
|long double |Extended precision floating point |machine/implementation dependent |

8.3 Exercise

How many bytes are required for each of these data types? Use the final column to help you fill in the blanks.

|Datatype |No of bytes required |Description/Typical values |
|char | |1 byte ASCII code |
|int | |-32767 to +32767, i.e. -(2^15 - 1) .. 2^15 - 1 |
|short | |often the same as int |
|long | |-2147483647 to +2147483647, i.e. -(2^31 - 1) .. 2^31 - 1 |
|float |to be discussed in the next section |6 decimal digits of precision, min value 1E-37, max value 1E+37, smallest value x such that 1+x ≠ 1 is 1E-5 |
|double |to be discussed in the next section |10 decimal digits of precision, min value 1E-37, max value 1E+37, smallest value x such that 1+x ≠ 1 is 1E-9 |
|long double |machine/implementation dependent |machine/implementation dependent |

8.4 Further reading

Blundell, B: pp 67-82

Ceri, Mandroli, Sbattella: The Art & Craft of Computing, Sections 13.1-13.4.1, pp 312-320

9 Representation of Fractional Numbers

Fractional numbers are numbers between 0 and 1. Their representation in decimal or binary simply uses negative powers of the base. Recall that a^-n = 1/a^n

For example:

10^-1 = 1/10^1 = 1/10 = 0.1            2^-1 = 1/2^1 = 1/2 = 0.5

10^-2 = 1/10^2 = 1/100 = 0.01          2^-2 = 1/2^2 = 1/4 = 0.25

10^-3 = 1/10^3 = 1/1000 = 0.001        2^-3 = 1/2^3 = 1/8 = 0.125

10^-4 = 1/10^4 = 1/10000 = 0.0001      2^-4 = 1/2^4 = 1/16 = 0.0625

10^-5 = 1/10^5 = 1/100000 = 0.00001    2^-5 = 1/2^5 = 1/32 = 0.03125

etc.

The number 0.4261_10 is 4×10^-1 + 2×10^-2 + 6×10^-3 + 1×10^-4

The number 0.01101_2 is 0×2^-1 + 1×2^-2 + 1×2^-3 + 0×2^-4 + 1×2^-5

9.1 To translate fractional binary to fractional decimal

Simply work out the sum using the decimal or fractional values of the bits.

e.g. 0.01101_2 = 0×2^-1 + 1×2^-2 + 1×2^-3 + 0×2^-4 + 1×2^-5

= 0 + 0.25 + 0.125 + 0 + 0.03125

= 0.40625

(or 0.01101_2 = 1/4 + 1/8 + 1/32 = 13/32 = 0.40625)
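The same sum can be sketched in Python using exact fractions, so the result appears both as 13/32 and as a decimal. The function name `binary_fraction_value` is an assumption for this illustration; it takes only the bits after the point.

```python
from fractions import Fraction

def binary_fraction_value(bits: str) -> Fraction:
    """Value of 0.bits in binary: the k-th bit after the point is worth 2^-k."""
    return sum((Fraction(int(b), 2 ** k) for k, b in enumerate(bits, start=1)),
               Fraction(0))

value = binary_fraction_value("01101")
print(value, float(value))   # 13/32 0.40625
```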

A table can be used, similar to the one used to work out integer values.

|Powers of 2 |. |

9.2 To translate fractional decimal to fractional binary

9.2.1 Either...

Successively subtract the highest possible value of negative power of 2

e.g. 0.425_10 - 0.25 = 0.175 (2^-2 = 0.01_2)

0.175 - 0.125 = 0.05 (2^-3 = 0.001_2)

0.05 - 0.03125 = 0.01875 (2^-5 = 0.00001_2)

etc

which gives 0.01101 to an accuracy of 5 binary digits after the point.
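The successive-subtraction method can be sketched as below. The function name `fraction_to_binary` is hypothetical, and the sketch takes the decimal value as a string so that `Fraction` can hold it exactly.

```python
from fractions import Fraction

def fraction_to_binary(value: str, places: int) -> str:
    """Greedy method: subtract each negative power of 2 that still fits."""
    remainder = Fraction(value)          # exact, e.g. Fraction("0.425") == 17/40
    bits = ""
    for k in range(1, places + 1):
        power = Fraction(1, 2 ** k)
        if power <= remainder:           # 2^-k fits: record a 1 and subtract it
            bits += "1"
            remainder -= power
        else:                            # 2^-k is too large: record a 0
            bits += "0"
    return "0." + bits

print(fraction_to_binary("0.425", 5))    # 0.01101
```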

This can be done using a table as before:

|To convert 0.425 to binary |Powers of 2 |. |2^-1 |

|11.010 | 1.1010 × 2^+1 | 1010 |+1 |
| 0.00011 | 1.1 × 2^-4 | 1000 |-4 |
| 0.010101 | 1.0101 × 2^-2 | 0101 |-2 |
| 1.101 | 1.101 × 2^0 | 1010 |0 |

Thus, each of the values in the table above can be stored using the values of the mantissa and the exponent.

This is the basis of how real numbers are stored in computers using the IEEE 754 standard.

10.4 IEEE 754 Standard Representation of Real Numbers

This is now the standard way of representing real (binary) numbers in computers. For single precision, there are a total of 32 bits used as shown here.

|No of bits |1 |8 |23 |

|Component |S(sign) |E (exponent) |F (fraction or mantissa) |

|Bit Number(s) |31 |30 -------- 23 |22 -------------------------- 0 |

S - the sign of the number, is 0 for positive or 1 for negative

E - the exponent, uses Excess 127 representation (i.e. Excess 2^7 - 1) and can take values between 1 and 254, equivalent to a range of -126 to 127. (The values 0 and 255 are reserved for special cases.)

F - the mantissa, uses a normalised representation as described above. The leading one and the binary point are omitted (because they can be assumed to be present). F should more correctly be called the significand.

Zero is a special number and is represented by values of E=0 and F=0 (with S taking the value of either 0 or 1).

For more detailed understanding of the IEEE 754 Standard Representation of Real Numbers you are recommended to read the reprint of a web article which is given after the following exercises.

This article will also help you with the tutorial exercises.

10.5 Exercises:

A reprint of a web article is included next.

a) Read this to find out what is meant by NaN and how NaN is represented in the IEEE standard.

b) Use the example given to help you to represent the decimal value 132.2 as a single precision IEEE 754 number.

c) Find out if there is a C library that allows you to use infinity and NaN.

Representing Real Numbers


1. Background

Representing real numbers in a fixed-size word is a challenge because they can have a varying number of decimals and an enormous range. As early as 1945, John von Neumann realised that, because of round-off errors, it would be impossible to represent real numbers exactly, as was done for integers, and he even suggested that computers should not deal with such numbers! On the hardware side, designers were reluctant to incorporate real number support in their processors because that would have a severe impact on performance and occupy valuable real estate on the processor chip. The evolution of real number representation went through a number of controversies and backward steps before a standard (IEEE 754 / 854) finally emerged in 1985. And while so many people contributed to this development over the years, it is widely believed that the work of William Kahan, which started at the University of Toronto in 1953, was instrumental, and for that, he was awarded the Turing Award in 1989.

2. Objectives

The goals of the IEEE standard are twofold:

1. Strike a compromise between range and accuracy

2. Minimize floating-point hardware by exploiting the integer unit h/w.

3. Single-Precision IEEE 754

The number is represented in 32 bits (4 bytes) using the following format:

[Figure: 32-bit word laid out as S (1 bit), E (8 bits), F (23 bits)]

• S is the sign of the number (1=minus, 0=plus), contributing (-1)^S.

• E is the biased exponent, contributing 2^(E-127). It is represented as an 8-bit unsigned integer and thus has the range 0 through 255.

• F is the binary fraction (mantissa or significand), contributing 1.F; i.e. an implicit 1 and a binary point are not represented but assumed present. Together with the implicit 1, 24 bits are thus significant. Since a decimal digit can be represented in between 3 and 4 bits, we can see that this single-precision representation produces about 7 significant digits.
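The three fields can be pulled out of a real single-precision bit pattern with Python's standard `struct` module. This is a sketch only: the function names are hypothetical, and `float32_value` applies the formula for normalised numbers, not the special values.

```python
import struct

def float32_fields(x: float):
    """Split the 32-bit pattern of x into S (1 bit), E (8 bits), F (23 bits)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

def float32_value(sign: int, exponent: int, fraction: int) -> float:
    """(-1)^S * 1.F * 2^(E-127), valid for normalised numbers (1 <= E <= 254)."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)

s, e, f = float32_fields(-6.5)       # -6.5 = -1.101 x 2^2 in binary
print(s, e, format(f, "023b"))       # 1 129 10100000000000000000000
print(float32_value(s, e, f))        # -6.5
```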

This standard achieves the 1st goal by using powers of 2 instead of 10. It also realises the 2nd goal because of the way it orders the field and represents the exponent: Since S is in the most significant bit, it is easy to perform a test of less than, greater than, or equal to 0 using integer unit hardware. Placing the exponent E before the mantissa also allows us to sort floating-point numbers using integer comparison instructions, as long as all exponents are non-negative. In fact, this is why the standard uses biased (or excess 127) notation instead of 2's complement.

The exponent E ranges between 0 and 255 but the end points of this interval are reserved for special values (see below). Hence, the allowed range of E is 1 through 254, which gives a magnitude range (in absolute value) of about:

2^(1-127) = 2^-126 ≈ 10^-38 to 2^(254-127) = 2^127 ≈ 10^+38

Special Values:

• The fact that 1. is implicit means that the number zero cannot be represented and, hence, a special value must be set aside for zero.

This (zero) value is: E = F = 0 (regardless of S).

• Furthermore, instead of issuing an exception when a divide by 0 is attempted, the standard allows software to set special values to indicate plus or minus infinity, thus providing for topological closure in real arithmetic.

These (infinity) values are: E = 255, F = 0, S = 0 / 1.

• The results of invalid operations (such as 0/0 or infinity minus infinity) are indicated by a special value in the standard known as NaN (Not a Number), thus allowing a program to proceed with a number of operations and postpone some tests and decisions to a later time in the program.

This (NaN) value is: E = 255, F not = 0 (regardless of S).

• The standard realises that there is a gap between 0 and the smallest positive (and largest negative) number that can be represented, and allows some numbers in this gap to be represented. Supporting such a gradual underflow means that we abandon our implicit 1. premise and replace it by 0. Such numbers are denormalized and can be as small (in absolute value) as about 2^-23 × 2^-126 = 2^-149 ≈ 10^-45. Due to their departure from the premise, most processor makers do not support this optional feature of the standard.

These (denormalized) numbers have: E = 0, F not = 0, S = 0 / 1.
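The rules for the special values can be collected into one small classifier. This is a sketch under the E and F conditions listed above; the function name `classify_float32` is an assumption, and the bit pattern is obtained with Python's standard `struct` module.

```python
import struct

def classify_float32(x: float) -> str:
    """Name the kind of single-precision value, using only the E and F fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "normalised"

for x in (0.0, -0.0, 1.0, 2.0 ** -149, float("inf"), float("nan")):
    print(x, classify_float32(x))
```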

4. Double-Precision IEEE 754

This 64-bit (8 bytes) representation provides added accuracy and increased range but uses the same idea as single-precision. Here are the differences:

• 11 bits for E (with a bias of 1023) and 52 bits for F.

• Aside from special values, E ranges between 1 and 2046. The range is thus: 2^-1022 (about 10^-308) to 2^1023 (about 10^+308).

• Based on 52+1 significant bits, the accuracy is about 15 (decimal) digits.

• Special Values:

E = 0, F = 0 (zero)

E = 0, F not 0 (denormalized)

E = 2047, F = 0 (infinity)

E = 2047, F not 0 (NaN)

5. Example

Represent 19.2 as a single-precision IEEE 754 number:

19 / 2 = 9 R 1

9 / 2 = 4 R 1

4 / 2 = 2 R 0

2 / 2 = 1 R 0

1 / 2 = 0 R 1

Hence 19 = 10011

0.2 * 2 = 0.4

0.4 * 2 = 0.8

0.8 * 2 = 1.6

0.6 * 2 = 1.2

0.2 * 2 = 0.4

0.4 * 2 = 0.8

Hence 0.2 = 001100110011..

19.2 = + 10011 . 001100110011...

= + 1. 0011 0011 0011 0011 0011 001 × 2^4

S = 0, E = 4+127 = 131 = 10000011 and F = 00110011001100110011001 (the first 23 bits after the binary point)
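The two hand methods used in this example (repeated division by 2 for the integer part, repeated multiplication by 2 for the fraction) can be sketched as follows. The function names are hypothetical, and the sketch simply truncates the fraction to 23 bits, as the worked example does.

```python
from fractions import Fraction

def integer_part_bits(n: int) -> str:
    """Repeatedly divide by 2; the remainders, read bottom-up, are the bits."""
    bits = ""
    while n > 0:
        n, remainder = divmod(n, 2)
        bits = str(remainder) + bits
    return bits or "0"

def fraction_part_bits(frac: Fraction, count: int) -> str:
    """Repeatedly multiply by 2; the integer part of each product is the next bit."""
    bits = ""
    for _ in range(count):
        frac *= 2
        bit = int(frac)          # 0 or 1
        bits += str(bit)
        frac -= bit
    return bits

whole = integer_part_bits(19)                      # '10011'
part = fraction_part_bits(Fraction("0.2"), 24)     # '0011' repeated
shift = len(whole) - 1                             # the point moves 4 places left
stored_f = (whole + part)[1:24]                    # drop the leading 1, keep 23 bits
print(whole, part[:12])                            # 10011 001100110011
print(shift + 127, format(shift + 127, "08b"))     # 131 10000011
print(stored_f)                                    # 00110011001100110011001
```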

11 Discreteness of real number representation

11.1 Range and accuracy

Not all possible values can be represented by floating point representation of numbers. Only a discrete set of values can be represented. This has implications on the accuracy of computer arithmetic. e.g. try adding together a large number and a very small number.

However, the big advantage of using floating point representation is that a vast range of numbers can be represented with both very small and very large values being possible. This is because the exponent can take both positive values (to represent large values) and negative values (to represent small values).

A sign bit is used to represent the sign of the number, so both positive and negative values are possible.

The accuracy of the number represented depends on the number of bits used to represent the mantissa and real numbers will usually contain some rounding error (the difference between the correct value and its nearest expressible value).

11.2 Errors

Overflow will occur when the number is too big to be represented. This is when the exponent does not contain enough bits to hold the power of 2 that is required.

e.g. in decimal, if the exponent could hold 2 digits, then there would be overflow for any value greater than 10^99.

Underflow will occur when the number is too small to be represented. This is when the exponent does not contain enough bits to hold the negative power of 2 that is required.

e.g. in decimal, if the exponent could hold 2 digits, then there would be underflow for any value smaller than 10^-99.

The spacing between adjacent expressible numbers is not constant

e.g. (using decimal notation) 0.999 × 10^+99 - 0.998 × 10^+99 (= 0.001 × 10^+99)

is much greater than

0.999 × 10^+01 - 0.998 × 10^+01 (= 0.001 × 10^+01)

but the relative error (caused by rounding) is approximately constant throughout the range of expressible numbers.
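The changing gap between neighbouring values can be observed directly. The sketch below assumes Python floats are IEEE 754 double-precision numbers (52 fraction bits), which is the usual case; the helper name `next_double_up` is hypothetical and is only meant for positive finite inputs.

```python
import struct

def next_double_up(x: float) -> float:
    """Next representable double above a positive finite x (add 1 to its bit pattern)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return struct.unpack(">d", struct.pack(">Q", bits + 1))[0]

for x in (1.0, 2.0 ** 20, 2.0 ** 60):
    gap = next_double_up(x) - x
    print(x, gap, gap / x)    # the gap grows with x, while gap / x stays at 2**-52

# Adding a small number to a large one: the small number is lost in rounding.
print(2.0 ** 53 + 1.0 == 2.0 ** 53)   # True
```

The absolute gap is very different at each magnitude, while the gap divided by the value stays the same for these inputs, which matches the point about relative error made above.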

11.3 Further Reading

Floating-Point Fallacies by Dan Zuras:

This article starts with the following:

The puzzling behavior of some programs that use floating-point arithmetic to solve problems can often elicit some interesting questions. "Why is it that 0.2 x 5.0 results in 0.9999999 not 1.0?" "Why is sin(3.14159265) equal to -8.742276e-8 not 0.0 ?" "How can the total system usage be 100.1% ?"

In the next paragraph it continues with:

Those who use and write programs that use floating-point to solve problems have a common problem. Floating point numbers and operations are expected to behave like real numbers and operations. This quite reasonable expectation is, unfortunately, never true.

Download and read the whole article!


Tutorial Work Sheet 1

Binary, Octal, Hex

These exercises should be attempted during the staffed tutorial session. You can ask for help at any time during the tutorial.

If you do not finish the exercises during the tutorial, you should complete them in your own time before the next tutorial. If you have any questions, ask at the start of the next tutorial.

Tutorial Exercises

1. Convert the binary numbers below to decimal.

(a) 1010 (b) 1111 (c) 00010000 (d) 00010101 (e) 00000101 (f) 11000110

2. Convert each decimal number below to binary.

(a) 64 (b) 15 (c) 234 (d) 1 (e) 100 (f) 456

3. Perform the following binary additions. (Hint: write each pair of numbers one below the other and, if necessary, extend both to 8 bits by adding extra 0’s at the left hand side.)

(a) 1+1 (b) 00110100 + 00001011 (c) 001010 + 001111 (d) 00110111 + 11000

4. Perform the following binary subtractions.

Difficult isn’t it! Don’t worry if you can’t do this question, you’ll learn a better way of doing this soon.

(a) 0111 - 0101 (b) 1101 - 1010 (c) 1111 - 0110 (d) 01110110 - 01001101

5. What are the 16 digits of the hexadecimal system? Now convert the hex numbers below to binary.

(a) FFFF (b) AF12 (c) 3457

6. Convert the following decimal numbers to binary and then to hexadecimal.

(a) 59 (b) 29 (c) 842 (d) 3595 (e) 32


7. Convert the binary numbers below to their octal and hex equivalents. (Hint: you may need to put extra 0’s on the left hand side to get the right number of bits.)

(a) 1110 (b) 11011 (c) 110110101 (d) 1010111101110010

8. Some computers represent numeric data using Binary Coded Decimal (BCD) where 4 bits are used to represent each decimal digit. For example, 0010 1001 is the BCD form of the decimal number 29. What decimal numbers do each of the following BCDs represent?

(a) 0101 0001 (b) 1000 1000 1000 (c) 0110 0011 (d) 0111 0001 0010

Questions from Blundell:

Activities 1.4 to 1.8

Discussion Questions

1. Given a binary number, is it true that the 1's complement of its 1's complement is the original number? Can you give a reason? Is the same true for 2's complement?

2. Assume that X, Y and Z are digits. Discuss whether it is always true that XYZ8 ................