Internal Representation of Characters




In a computer, each character is represented by a unique integer code, typically stored in 8 bits. The output of the following program illustrates this point.

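#include <stdio.h>

int main(void) {
    char string[] = "Apa 2 U!";
    int i = 0;

    /* Print the integer code of each character until the terminating '\0'. */
    while (string[i] != '\0') {
        printf("%d ", string[i]);
        i++;
    }
    return 0;
}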

Here we see that the character 'A' is represented by the value 65, 'p' by 112, 'a' by 97, ' ' (space) by 32, '2' by 50, and so on.

Notice the following important facts and their implications:

1) Upper-case and lower-case letters have different codes. This must be taken into account when comparing strings: "ABU" and "abu" are not the same.

2) The character '2' in a computer is not 2 (0000 0010) but 50 (0011 0010). Hence, the computer "sees" '2' and 2 differently.
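As a minimal sketch of the first point, the two helpers below compare strings with and without regard to case. The names equal_exact and equal_ignore_case are hypothetical (they do not come from the original notes); strcmp and tolower are standard C library functions.

```c
#include <ctype.h>
#include <string.h>

/* Case-sensitive: compares the character codes exactly as stored. */
int equal_exact(const char *a, const char *b) {
    return strcmp(a, b) == 0;
}

/* Case-insensitive: folds each character to lower case before comparing. */
int equal_ignore_case(const char *a, const char *b) {
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == '\0' && *b == '\0';
}
```

With these helpers, equal_exact("ABU", "abu") is 0 because 'A' (65) and 'a' (97) are different codes, while equal_ignore_case("ABU", "abu") is 1.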

The two most commonly used code sets are:

1. ASCII (American Standard Code for Information Interchange)

2. EBCDIC (Extended Binary Coded Decimal Interchange Code)

The Microsoft Windows operating system uses the ASCII code set. Here is a portion of the ASCII code table, showing mainly the printable characters.

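|     | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| 30  |   |   |   | ! | " | # | $ | % | & | ' |
| 40  | ( | ) | * | + | , | - | . | / | 0 | 1 |
| 50  | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; |
| 60  | < | = | > | ? | @ | A | B | C | D | E |
| 70  | F | G | H | I | J | K | L | M | N | O |
| 80  | P | Q | R | S | T | U | V | W | X | Y |
| 90  | Z | [ | \ | ] | ^ | _ | ` | a | b | c |
| 100 | d | e | f | g | h | i | j | k | l | m |
| 110 | n | o | p | q | r | s | t | u | v | w |
| 120 | x | y | z | { | | | } | ~ |   | Ç | ü |

The code of a character is its row label plus its column label; for example, 'A' sits in row 60, column 5, so its code is 65. Codes 30, 31 and 127 are non-printable control characters, code 32 is the space, and codes 128 and 129 lie outside the 7-bit ASCII range and belong to an extended code page.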
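The table can also be regenerated by a program, which is a quick way to check any entry. The sketch below assumes an ASCII execution character set; the function names table_entry and print_ascii_table are hypothetical and chosen only for this illustration.

```c
#include <stdio.h>

/* Character found at the given row label (a multiple of 10) and column (0-9). */
char table_entry(int row, int column) {
    return (char)(row + column);
}

/* Prints the printable ASCII range (codes 32 to 126), ten codes per row. */
void print_ascii_table(void) {
    int row, column;

    for (row = 30; row <= 120; row += 10) {
        printf("%3d:", row);
        for (column = 0; column <= 9; column++) {
            int code = row + column;
            /* Codes below 32 and above 126 are not printable ASCII: leave a blank. */
            if (code >= 32 && code <= 126) {
                printf(" %c", table_entry(row, column));
            } else {
                printf("  ");
            }
        }
        printf("\n");
    }
}
```

Calling print_ascii_table() from main prints one line per table row; table_entry(60, 5) gives 'A', matching the row-plus-column reading of the table.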

Notice how the codes for lower-case letters, upper-case letters, and digits are arranged: each group occupies consecutive codes, in the natural order of its characters.

lower-case letters: 'a', 'b', 'c', ..., 'z' have ASCII codes 97, 98, 99, ..., 122

upper-case letters: 'A', 'B', 'C', ..., 'Z' have ASCII codes 65, 66, 67, ..., 90

numeric digits: '0', '1', '2', ..., '9' have ASCII codes 48, 49, 50, ..., 57

Recognizing this arrangement is important when developing character- and string-processing code.
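Because each group is consecutive, simple arithmetic on the codes is enough for many tasks. The sketch below, which assumes ASCII and uses the hypothetical names to_upper_ascii and letter_index, converts a lower-case letter to upper case and finds a letter's position in the alphabet.

```c
/* Converts 'a'..'z' to 'A'..'Z'; any other character is returned unchanged. */
char to_upper_ascii(char c) {
    if (c >= 'a' && c <= 'z') {
        /* Same offset within the group, shifted to the upper-case range. */
        return (char)(c - 'a' + 'A');
    }
    return c;
}

/* Position of a lower-case letter in the alphabet: 'a' -> 0, ..., 'z' -> 25. */
/* Returns -1 when c is not a lower-case letter. */
int letter_index(char c) {
    if (c >= 'a' && c <= 'z') {
        return c - 'a';
    }
    return -1;
}
```

For example, to_upper_ascii('p') yields 'P' because 112 - 97 + 65 = 80, the code of 'P'. In real programs the standard toupper function from ctype.h is usually the better choice, since it does not depend on the code set.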

The following logical expression determines whether a given character is a numeric digit:

if (my_char >= '0' && my_char <= '9')

The same pattern, with 'A' and 'Z' or 'a' and 'z' as the bounds, tests for an upper-case or a lower-case letter.

...

for (...; i >= 0; i--) {
    result += char_to_decimal(string[i]) * exponent;
    exponent *= 10;
}
return result;
}
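The loop above walks a string of digits from right to left, multiplying each digit by a growing power of ten. A minimal complete sketch of that idea follows. Only char_to_decimal, result, exponent, string, length and i appear in the original; the body of char_to_decimal, the helper is_digit_char, and the function name string_to_int are assumptions made for this illustration. The sketch handles no sign and does not check for overflow.

```c
/* Nonzero when my_char is one of '0'..'9'. */
int is_digit_char(char my_char) {
    return my_char >= '0' && my_char <= '9';
}

/* '0' -> 0, ..., '9' -> 9; relies on the digit codes being consecutive. */
int char_to_decimal(char my_char) {
    return my_char - '0';
}

/* Converts the leading run of digits in string to an int. */
int string_to_int(const char string[]) {
    int length = 0;
    int result = 0;
    int exponent = 1;
    int i;

    /* Count how many digit characters start the string. */
    while (is_digit_char(string[length])) {
        length++;
    }

    /* Rightmost digit is the ones place, the next is the tens place, and so on. */
    for (i = length - 1; i >= 0; i--) {
        result += char_to_decimal(string[i]) * exponent;
        exponent *= 10;
    }
    return result;
}
```

For the string "123" the loop adds 3*1, then 2*10, then 1*100, giving 123. The standard library functions atoi and strtol perform the same conversion and are preferable in production code.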

The output of the first program, which prints the codes of "Apa 2 U!", is:

65 112 97 32 50 32 85 33
