Crash Course on Character Encodings

Yusuke Shinyama

NYCNLP Oct. 27, 2006

Introduction

Are they the same?

Unicode
UTF

Two Mappings

Character --> Character Code --> Byte Sequence

Character   Character Code   Byte Sequence
@           64               64
ض           1590             216 182
美          32654            231 190 142

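Two Mappings

Character --(Unicode)--> Character Code --(UTF-8)--> Byte Sequence

Character   Character Code   Byte Sequence
@           64               64
ض           1590             216 182
美          32654            231 190 142

Unicode is the "Character Set": it maps each character to a character code.
UTF-8 is the "Encoding Scheme": it maps each character code to a byte sequence.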

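The two mappings can be tried out directly. The following is a minimal sketch in Python 3, not part of the original slides. The helper names `two_mappings` and `reverse_mappings` are hypothetical. It relies on the built-in `ord` for the character-to-code step (Unicode) and on `str.encode` for the code-to-bytes step (UTF-8).

```python
def two_mappings(ch):
    """Return (character code, byte values) for a single character.

    Hypothetical helper for illustration only.
    """
    # Mapping 1 (character set): character -> Unicode character code.
    code = ord(ch)
    # Mapping 2 (encoding scheme): character code -> UTF-8 byte sequence.
    byte_values = list(ch.encode("utf-8"))
    return code, byte_values


def reverse_mappings(byte_values):
    """Undo both mappings: UTF-8 byte values -> list of characters."""
    return list(bytes(byte_values).decode("utf-8"))


if __name__ == "__main__":
    # The three characters with codes 64, 1590 and 32654.
    for ch in ["@", "\u0636", "\u7f8e"]:
        code, byte_values = two_mappings(ch)
        print(code, byte_values)
```

Swapping `"utf-8"` for another codec name such as `"utf-16-be"` changes only the second mapping; the character codes stay the same, which is the sense in which Unicode and UTF are not the same thing.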
