Miscellaneous Notes
Db2 for z/OS CCSID Overview
Db2 for z/OS leverages the z/OS operating system’s Conversion Services facilities to enable Db2 support of character data stored in various encoding schemes and code pages. These z/OS services enable Db2 for z/OS to work with (i.e. receive as input, process, store, and retrieve) character data represented by the various code pages in the encoding schemes EBCDIC, ASCII, and UNICODE.
This paper summarizes, at a moderately technical level, the ways and methods in which Db2 for z/OS can be configured and employed to manage character data from a disparate set of encoding schemes and code pages. It also discusses the ways in which Db2 for z/OS leverages z/OS Conversion Services for the purposes of managing and manipulating character data.
For the remainder of this discussion, the term Db2 will mean Db2 for z/OS (as opposed to Db2 LUW, etc.). This discussion is exclusively for Db2 for z/OS, where the term Db2 is concerned and used. References to z/OS are exclusively for those services and programs executing in a z/OS operating system.
This discussion pertains only to character data, never to numeric data or binary (BIT) data. For this purpose, one should note the difference between the character ‘1’ and the number 1. ‘1’ is a character data type, 1 is a numeric data type. This discussion only pertains to character data types. Numeric and binary (BIT) data in Db2 are not a part of this discussion.
DISCLAIMER: The following discussion is a personal interpretation of the IBM provided documentation on these topics. Because interpretations of technical documentation may vary from person to person, and are therefore subject to scrutiny, the user should always rely on IBM documentation for the definitive and authoritative explanation of each topic.
This paper is as of Db2 12 for z/OS and z/OS 2.2, and was written on Oct. 12, 2017 by:
Kurt Bohnert
Manager, Db2 Systems Programming
Rocket Software, Inc.
In computing, an “encoding scheme” can be defined as a set of specific definitions and rules that describe a philosophy used to represent character data. One may think of an encoding scheme as simply a set of rules used for representing character data.
Some examples of encoding scheme rules might include (but are not limited to) the following: The number of bits used for each character, the number of bytes used for each character, the allowable ranges of bytes, maximum number of characters, … there are many other possible rules that might apply in an encoding scheme. The rules of an encoding scheme govern the way in which characters in the encoding scheme are represented in computing.
Db2 for z/OS is enabled to work with (manage) data governed by any one of the following three encoding schemes: EBCDIC, ASCII, and UNICODE.
Within each encoding scheme, there can be many “code pages”. A code page consists of a table of values that describes the character set for a given language. Each code page in an encoding scheme must conform to the specific rules for that encoding scheme. The various code pages in an encoding scheme are different however, in that they may represent different languages (as one example), yet they all conform to the rules of the encoding scheme in which they reside. There will be examples of this later in this paper.
IBM Db2 employs an IBM concept called a “CCSID”. CCSID stands for Coded Character Set ID. A CCSID identifies an encoding scheme and a code page for that encoding scheme. Although a CCSID is technically a more precise definition of a code page, one may think of a CCSID and a code page as synonymous for the remainder of this paper. In those cases where it is important to distinguish between the two, the distinction will be made.
Db2 assigns a CCSID to all character data it manages. All character data passed to Db2 from an application/agent is assigned a CCSID. The character data is said to be “tagged” with the CCSID. The CCSID is stored with the character data if the data is stored in a Db2 data base. It becomes part of the “metadata” for that character data. Metadata is “data about data”. Metadata contains information about the data being managed. CCSID is always a part of the metadata for character data stored in Db2 and manipulated by Db2.
CCSID’s in Db2 are grouped in three’s … referred to in this discussion as “triplets”. One of the three CCSID’s in a triplet is the Single-Byte-Character-Set (SBCS), one of the three CCSID’s is the Double-Byte-Character-Set (DBCS), and one of the three CCSID’s is the MIXED character set. This will be discussed in greater detail below. The important point to note for now is that Db2 CCSID’s come in three’s … in this discussion referred to as triplets.
So CCSID’s are Db2’s implementation of character data management within the Db2 DBMS (Data Base Management System). From this point forward in the paper, we are only concerned with how, when, and where Db2 employs CCSID’s in support of character data interpretation, manipulation, and storage.
Db2 recognizes many CCSID triplets for both the EBCDIC and ASCII encoding schemes. Again, a triplet is three CCSID’s … a SBCS CCSID, a DBCS CCSID, and a MIXED CCSID. Db2 only recognizes one CCSID triplet for UNICODE however, because UNICODE (unlike EBCDIC and ASCII) only contains one triplet. As will be seen later in this paper, UNICODE is an industry wide attempt to standardize and unify all character schemes (i.e. all the various characters) from all over the computing world, into a single triplet. UNICODE can be thought of as a single code page (triplet) containing all characters from all other encoding schemes and code pages. This will be discussed in greater detail below.
It should be noted here that (as previously stated) a CCSID is a more precise definition of a code page. This means (among other things) that there can be a one-to-many relationship between code page and CCSID.
Here is an example of two CCSID’s in the same code page within the EBCDIC encoding scheme. EBCDIC code page 37 has (at least) two CCSID’s. They are CCSID 0037 and CCSID 1140. The only difference between the two CCSID’s is at code point 9F in the two CCSID tables. In CCSID 0037 you will find the "¤" (currency sign) character at code point 9F, but in CCSID 1140 at code point 9F you will find the "€" (Euro sign) character. CCSID 0037 might be considered a standard SBCS CCSID in North America, and CCSID 1140 a standard SBCS CCSID in Europe.
On the next page is a copy of the CCSID table for CCSID 0037. It was cut and paste off the internet. You can find many such examples of CCSID’s and code pages on the internet. In fact, you can likely find the table for every CCSID and code page ever constructed, on the internet.
Note in the example below that the code point 9F contains the "¤" (currency sign) character. If one looked at CCSID 1140, one would see that the only difference between the two CCSID’s is at this code point (9F), where CCSID 1140 contains the "€" (Euro sign) character.
CCSID 1140 is the Euro currency update of CCSID 0037. In CCSID 1140, the "¤" (currency sign) character at code point 9F is replaced with the "€" (Euro sign) character.
See code point 9F in red, in the table below.
|CCSID 0037 |
|_0 |_1 |_2 |_3 |_4 |_5 |_6 |_7 |_8 |_9 |_A |_B |_C |_D |_E |_F | |
0_
|NUL
0000
0 |SOH
0001
1 |STX
0002
2 |ETX
0003
3 |SEL
009C
4 |HT
0009
5 |RNL
0086
6 |DEL
007F
7 |GE
0097
8 |SPS
008D
9 |RPT
008E
10 |VT
000B
11 |FF
000C
12 |CR
000D
13 |SO
000E
14 |SI
000F
15 | |
1_
|DLE
0010
16 |DC1
0011
17 |DC2
0012
18 |DC3
0013
19 |RES ENP
009D
20 |NL
0085
21 |BS
0008
22 |POC
0087
23 |CAN
0018
24 |EM
0019
25 |UBS
0092
26 |CU1
008F
27 |IFS
001C
28 |IGS
001D
29 |IRS
001E
30 |IUS ITB
001F
31 | |
2_
|DS
0080
32 |SOS
0081
33 |FS
0082
34 |WUS
0083
35 |BYP INP
0084
36 |LF
000A
37 |ETB
0017
38 |ESC
001B
39 |SA
0088
40 |SFE
0089
41 |SM SW
008A
42 |CSP
008B
43 |MFA
008C
44 |ENQ
0005
45 |ACK
0006
46 |BEL
0007
47 | |
3_
|
0090
48 |
0091
49 |SYN
0016
50 |IR
0093
51 |PP
0094
52 |TRN
0095
53 |NBS
0096
54 |EOT
0004
55 |SBS
0098
56 |IT
0099
57 |RFF
009A
58 |CU3
009B
59 |DC4
0014
60 |NAK
0015
61 |
009E
62 |SUB
001A
63 | |
4_
|SP
0020
64 |RSP
00A0
65 |â
00E2
66 |ä
00E4
67 |à
00E0
68 |á
00E1
69 |ã
00E3
70 |å
00E5
71 |ç
00E7
72 |ñ
00F1
73 |¢
00A2
74 |.
002E
75 |<
003C
76 |(
0028
77 |+
002B
78 ||
007C
79 | |
5_
|&
0026
80 |é
00E9
81 |ê
00EA
82 |ë
00EB
83 |è
00E8
84 |í
00ED
85 |î
00EE
86 |ï
00EF
87 |ì
00EC
88 |ß
00DF
89 |!
0021
90 |$
0024
91 |*
002A
92 |)
0029
93 |;
003B
94 |¬
00AC
95 | |
6_
|-
002D
96 |/
002F
97 |Â
00C2
98 |Ä
00C4
99 |À
00C0
100 |Á
00C1
101 |Ã
00C3
102 |Å
00C5
103 |Ç
00C7
104 |Ñ
00D1
105 |¦
00A6
106 |,
002C
107 |%
0025
108 |_
005F
109 |>
003E
110 |?
003F
111 | |
7_
|ø
00F8
112 |É
00C9
113 |Ê
00CA
114 |Ë
00CB
115 |È
00C8
116 |Í
00CD
117 |Î
00CE
118 |Ï
00CF
119 |Ì
00CC
120 |`
0060
121 |:
003A
122 |#
0023
123 |@
0040
124 |'
0027
125 |=
003D
126 |"
0022
127 | |
8_
|Ø
00D8
128 |a
0061
129 |b
0062
130 |c
0063
131 |d
0064
132 |e
0065
133 |f
0066
134 |g
0067
135 |h
0068
136 |i
0069
137 |«
00AB
138 |»
00BB
139 |ð
00F0
140 |ý
00FD
141 |þ
00FE
142 |±
00B1
143 | |
9_
|°
00B0
144 |j
006A
145 |k
006B
146 |l
006C
147 |m
006D
148 |n
006E
149 |o
006F
150 |p
0070
151 |q
0071
152 |r
0072
153 |ª
00AA
154 |º
00BA
155 |æ
00E6
156 |¸
00B8
157 |Æ
00C6
158 |¤
00A4
159 | |
A_
|µ
00B5
160 |~
007E
161 |s
0073
162 |t
0074
163 |u
0075
164 |v
0076
165 |w
0077
166 |x
0078
167 |y
0079
168 |z
007A
169 |¡
00A1
170 |¿
00BF
171 |Ð
00D0
172 |Ý
00DD
173 |Þ
00DE
174 |®
00AE
175 | |
B_
|^
005E
176 |£
00A3
177 |¥
00A5
178 |·
00B7
179 |©
00A9
180 |§
00A7
181 |¶
00B6
182 |¼
00BC
183 |½
00BD
184 |¾
00BE
185 |[
005B
186 |]
005D
187 |¯
00AF
188 |¨
00A8
189 |´
00B4
190 |×
00D7
191 | |
C_
|{
007B
192 |A
0041
193 |B
0042
194 |C
0043
195 |D
0044
196 |E
0045
197 |F
0046
198 |G
0047
199 |H
0048
200 |I
0049
201 |SHY
00AD
202 |ô
00F4
203 |ö
00F6
204 |ò
00F2
205 |ó
00F3
206 |õ
00F5
207 | |
D_
|}
007D
208 |J
004A
209 |K
004B
210 |L
004C
211 |M
004D
212 |N
004E
213 |O
004F
214 |P
0050
215 |Q
0051
216 |R
0052
217 |¹
00B9
218 |û
00FB
219 |ü
00FC
220 |ù
00F9
221 |ú
00FA
222 |ÿ
00FF
223 | |
E_
|\
005C
224 |÷
00F7
225 |S
0053
226 |T
0054
227 |U
0055
228 |V
0056
229 |W
0057
230 |X
0058
231 |Y
0059
232 |Z
005A
233 |²
00B2
234 |Ô
00D4
235 |Ö
00D6
236 |Ò
00D2
237 |Ó
00D3
238 |Õ
00D5
239 | |
F_
|0
0030
240 |1
0031
241 |2
0032
242 |3
0033
243 |4
0034
244 |5
0035
245 |6
0036
246 |7
0037
247 |8
0038
248 |9
0039
249 |³
00B3
250 |Û
00DB
251 |Ü
00DC
252 |Ù
00D9
253 |Ú
00DA
254 |EO
009F
255 | | |_0 |_1 |_2 |_3 |_4 |_5 |_6 |_7 |_8 |_9 |_A |_B |_C |_D |_E |_F | |
Now that we have defined many of the pertinent terms, it is time to discuss how, when, and where Db2 employs CCSID’s in support of character data interpretation, manipulation, and storage. The important point to keep in mind during this discussion is this: Every character string used in a Db2 SQL operation, and every character string manipulated and/or stored by Db2 in any way, is tagged with a CCSID.
Recall that CCSID’s in Db2 come in three’s … there are three (3) CCSID’s associated together into a related group (a triplet):
o SBCS Single Byte Character Set CCSID
o DBCS Double Byte Character Set (also called GRAPHIC) CCSID
o MIXED Contains both SBCS data and DBCS data. SBCS and DBCS
are both always a subset of MIXED for their triplet
All CCSID triplets reside within one of the Encoding Schemes supported by Db2 for z/OS: EBCDIC, ASCII, or UNICODE.
When a Db2 subsystem is initially configured (built), there is an Application Programming Defaults load module created named DSNHDECP. This load module contains several application default parameters which are directly associated with this discussion. Here is a sample list of all parameters defined in DSNHDECP for Db2 V11. The parameters of interest in this discussion are in bold blue text:
ASCCSID=1115,
AMCCSID=1381,
AGCCSID=1380,
SCCSID=836,
MCCSID=935,
GCCSID=837,
USCCSID=367,
UMCCSID=1208,
UGCCSID=1200,
ENSCHEME=EBCDIC,
APPENSCH=EBCDIC,
DATE=USA,
DATELEN=0,
DECARTH=DEC15,
DECIMAL=COMMA,
DEF_DECFLOAT_ROUND_MODE=ROUND_HALF_EVEN,
DEFLANG=IBMCOB,
DELIM=DEFAULT,
IMPLICIT_TIMEZONE=CURRENT,
MIXED=YES,
NEWFUN=V11,
PADNTSTR=YES,
SQLDELI=DEFAULT,
DSQLDELI=APOST,
SSID=QB1M,
STDSQL=NO,
TIME=JIS,
TIMELEN=0,
DYNRULS=YES,
LC_CTYPE=
A default CCSID triplet is assigned for each of the three encoding schemes employed by Db2 (EBCDIC, ASCII, and UNICODE) in DSNHDECP (above) at subsystem installation. In our example above the defined defaults are:
ASCCSID=1115,
AMCCSID=1381,
AGCCSID=1380,
SCCSID=836,
MCCSID=935,
GCCSID=837,
USCCSID=367,
UMCCSID=1208,
UGCCSID=1200,
Along with these default CCSID triplets, three other related parameters are assigned in DSNHDECP:
ENSCHEME=EBCDIC,
APPENSCH=EBCDIC,
MIXED=YES,
Here is a summary for each of the DSNHDECP parameters of interest in this discussion (in bold blue text above):
ASCCSID The ASCII SBCS default CCSID
AMCCSID The ASCII MIXED default CCSID
AGCCSID The ASCII DBCS default CCSID
SCCSID The EBCDIC SBCS default CCSID
MCCSID The EBCDIC MIXED default CCSID
GCCSID The EBCDIC DBCS default CCSID
USCCSID The UNICODE SBCS default CCSID (always 367 in Db2)
UMCCSID The UNICODE MIXED default CCSID (always 1208 in Db2)
UGCCSID The UNICODE DBCS default CCSID (always 1200 in Db2)
ENSCHEME This is the default Encoding Scheme if not specified with the
CCSID parameter in the Db2 SQL CREATE statement.
APPENSCH This is the default Encoding Scheme (or CCSID) if not specified
with the ENCODING parameter in the Db2 BIND command.
MIXED Is mixed character data (SBCS and DBCS together) allowed (YES|NO)?
This is discussed in greater detail below.
There are a variety of CCSID triples supported by Db2 for z/OS for both EBCDIC and ASCII. Tables of valid (supported) EBCDIC and ASCII CCSID triplets can be found in the Db2 for z/OS Installation Guide. If you look at these tables, you will see that in our sample DSNHDECP above the defaults selected were for the language named “Simplified Chinese”:
Table 154. EBCDIC double-byte coded character set identifiers (CCSIDs)
National language MCCSID SCCSID GCCSID
Simplified Chinese 935 836 837
Table 155. ASCII double-byte coded character set identifiers (CCSIDs)
National language MCCSID SCCSID GCCSID
Simplified Chinese 1381 1115 1380
The UNICODE triplet is always as follows in Db2 for z/OS:
USCCSID=367,
UMCCSID=1208,
UGCCSID=1200,
For each triplet (regardless of encoding scheme):
o The SBCS CCSID (SCCSID, ASCCSID, and USCCSID) is a 1 byte CCSID (i.e. each character in the CCSID is a 1-byte character).
o The DBCS CCSID (GCCSID, AGCCSID, and UGCCSID) is a 2-4 byte CCSID (i.e. each character in the CCSID is a 2-to-4 bytes character). Note that this CCSID is often referred to as the “GRAPHIC” CCSID in the triplet.
o The MIXED CCSID (MCCSID, AMCCSID, and UMCCSID) is a 1-4 byte CCSID, since it contains all associated SBCS CCSID characters and all associated DBCS CCSID characters in the triplet.
A good practice is to set the EBCDIC triplet to the “green screen” 3270 emulator settings employed by the users of this Db2 subsystem.
A good practice is to set the ASCII triplet to the most commonly used workstation client settings for the users of this Db2 subsystem.
Note that in the above sample, MIXED=YES. The value of MIXED (YES|NO) indicates how the EBCDIC CCSID and ASCII CCSID fields are to be interpreted by Db2.
If MIXED=NO, then the EBCDIC (GCCSID and MCCSID) and ASCII (AGCCSID and AMCCSID) DBCS and MIXED CCSID’s are set to the default 65534, which means NO CCSID. This is because double-byte character data is not allowed for EBCDIC and ASCII character data in this Db2 subsystem, if MIXED=NO.
If MIXED=YES, then double byte character data is allowed for EBCDIC and ASCII character data in this Db2 subsystem. The EBCDIC (GCCSID and MCCSID) and ASCII (AGCCSID and AMCCSID) DBCS and MIXED CCSID’s are set to the appropriate CCSID’s to complete the requested triplet (the matching CCSID’s for the SBCS CCSID), as in our sample above.
Note: Do not confuse CCSID 65534 versus 65535:
CCSID 65534 means NO CCSID (used when MIXED = NO).
CCSID 65535 means NO CONVERSION (used to tag “FOR BIT DATA” data types … see
below for more on this).
UNICODE is always considered to be MIXED=YES, and as previously stated, is always set to USCCSID=367, UMCCSID=1208, and UGCCSID=1200 in Db2. This means that regardless of the value for MIXED in DSNHDECP, UNICODE character data can always contain single-byte and double-byte data.
Side note: NEVER change the values for either MIXED or any CCSID in the DSNHDECP of an existing Db2 subsystem, without first consulting with IBM!!! Per IBM, once these values have been specified, they should not be changed without IBM consultation.
When configuring a Db2 subsystem, we are primarily interested in three (3) issues pertaining to the setting of the DSNHDECP parameters listed above:
1. What CCSID is used to tag character data for storage within a Db2 table?
2. What CCSID is used to interpret incoming character data to Db2?
3. What CCSID is used to interpret incoming SQL during Db2 precompile (PC)?
Below is a discussion on each of the three issues of interest listed above.
1. What CCSID is used to tag character data for storage within a Db2 table (i.e. how is character data “tagged” for storage in Db2)?
Note that in the explanation below, character data is said to have one of the character (e.g. CHAR) data types in Db2.
FIRST: Identify the ENCODING SCHEME that was assigned to the TABLE in the SQL CREATE statements used to create the DATABASE.TABLESPACE.TABLE (DB.TS.TB):
IF CREATE TABLE … includes keyword CCSID UNICODE|EBCDIC|ASCII
THEN … it must match the value assigned at CREATE TS (below).
ELSE … it defaults to the value assigned at CREATE TS (below).
IF CREATE TABLESPACE … includes keyword CCSID UNICODE|EBCDIC|ASCII
THEN … it is assigned to the TS.
ELSE … it defaults to the value assigned at CREATE DB (below).
IF CREATE DATABASE … includes keyword CCSID UNICODE|EBCDIC|ASCII
THEN … it is the default if CREATE TS does not assign CCSID (above).
ELSE … it defaults to the value found in DSNHDECP parameter ENSCHEME.
Note that the CCSID parameter is always optional in the three CREATE statements above, and defaults based on the rules above.
SECOND: Identify the CCSID to use (based on the encoding scheme identified above), the MIXED value in DSNHDECP, and the data type (SBCS, DBCS, MIXED) of the character column:
IF MIXED = NO
IF “FOR MIXED DATA” data type THEN the DB.TS must be UNICODE (CCSID UNICODE in CREATE TS) else you get an error in the CREATE TABLE. The character data in this column is tagged with UNICODE MIXED CCSID 1208.
IF “GRAPHIC” data type THEN the DB.TS must be UNICODE (CCSID UNICODE in CREATE TS) else you get an error in the CREATE TABLE. The character data in this column is tagged with UNICODE DBCS CCSID 1200.
ELSE (not “FOR MIXED DATA” and not “GRAPHIC”) THEN the character data in this column is tagged with the SBCS CCSID of the DB.TS above.
IF MIXED = YES
IF “FOR SBCS DATA” data type THEN the character data in this column is tagged with the SBCS CCSID of the DB.TS above.
IF “GRAPHIC” data type THEN the character data in this column is tagged with the DBCS CCSID of the DB.TS above.
ELSE (not “FOR SBCS DATA” and not “GRAPHIC”) THEN the character data in this column is tagged with the MIXED CCSID of the DB.TS identified above.
2. What CCSID is used to interpret incoming character data to Db2 (e.g. character data passed to Db2 from a PLAN/PACKAGE)?
IF SQL is STATIC THEN
IF PLAN/PKG bound with ENCODING UNICODE|EBCDIC|ASCII|ccsid … THEN use that CCSID or the triplet associated with that encoding scheme to
interpret incoming character data from the PLAN/PKG.
ELSE use the triplet associated with the default encoding scheme in
APPENSCH to interpret incoming character data from the PLAN/PKG.
ELSE (SQL is DYN) use the triplet associated with the encoding scheme specified in the special register DEFAULT APPLICATION ENCODING SCHEME to interpret incoming character data from the PLAN/PACKAGE. The default value for this special register is the BIND option ENCODING (or DSNHDECP parameter APPENSCH if ENCODING not specified in the BIND … see above).
The application sending the character data to Db2 can override the above defaults and explicitly specify the CCSID to be used by Db2 to interpret incoming character data from the application, in one of several ways:
o EXEC SQL SET CURRENT APPLICATION ENCODING SCHEME = ‘…’;
o Any host variable (:hv) in SQL can override the above for CCSID via:
EXEC SQL DECLARE :hv VARIABLE CCSID UNICODE|EBCDIC|ASCII|ccsid;
o There is also a way to do it in SQLDA … see IBM doc for details.
3. What CCSID is used to interpret incoming SQL during Db2 precompile (PC)?
o There is a CCSID keyword in the PC: This PC parm tells the PC how to interpret incoming SQL for PC. It must be an EBCDIC CCSID!!
The default (if not specified) is the EBCDIC triplet in DSNHDECP.
This does not impact SQL :hv and character sting interpretation, only SQL statement interpretation (only the parsing of SQL statements).
From IBM doc: “Since all Db2 SQL keywords (e.g. SELECT) are representable on all EBCDIC code pages, there should not be a problem”.
When the SQL contains literals however, they are subject to data loss.
SQLCODE +355 means “Literal subject to data loss” … the usual substitution character in Db2 for z/OS is x’3F’ (see below for details).
Character conversion occurs when character data is moved from one CCSID to another (i.e. the tag is changed), but there are no hard and fast rules as to when a conversion occurs. A conversion occurs when necessary.
Example: When we store character data in Db2 for z/OS tagged with some EBCDIC CCSID, and we display that data on a client in an ASCII CCSID, the character data may need to be converted (translated) for display.
There are times when character conversion cannot occur, because some of the characters in the source CCSID cannot be mapped to the target CCSID. When that happens, characters in the source string are subject to conversion to a substitution character. The Db2 for z/OS substitution character is always x’3F’, and when this occurs you will see the SQLCODE +355.
Character conversion occurs in Db2 for z/OS transparently to the Db2 user. When Db2 determines that a conversion must occur, it invokes z/OS Conversion Services under the covers to perform the conversion. Once the conversion is completed, Db2 resumes normal processing of the data.
Character conversion is always performed at the receiver … example:
1. A client application sends an SQL SELECT statement to Db2 for z/OS with character data in the predicate.
2. Db2 for z/OS interprets the predicate character data per the above discussion (#2 above), and invokes z/OS Conversion Services if character conversion is required.
3. Db2 for z/OS returns a result set to the client. All character data in the result set is returned to the requester tagged with the CCSID in which it was stored in Db2 (i.e. Db2 does not convert it before sending).
4. The client interprets the character data and performs character conversion as required, via its own mechanisms on the client platform.
UNICODE is a single character set that encodes all the world’s scripts. New characters are being added to UNICODE all the time, as the CCSID continues to grow (to include more and more code pages). Therefore, character conversion is never required for UNICODE character data.
UNICODE attempts to standardize the representation and manipulation of data across vendors and platforms. The UNICODE standard provides a cross platform, cross vendor method of encoding character data that enables lossless representation and manipulation of character data (i.e. no substitution characters should ever be required for UNICODE data).
CCSID 367 is UNICODE SBCS, and is sometimes referred to as 7-bit ASCII.
CCSID 1200 is UNICODE DBCS, and is sometimes referred to as UTF-16.
CCSID 1208 is UNICODE MIXED, and therefore includes everything in CCSID 1200 (DBCS) and CCSID 367 (SBCS). CCSID 1208 is sometimes referred to as UTF-8.
When either EBCDIC or ASCII data is tagged with a MIXED CCSID, it means that the associated character string can contain both single byte and graphic (i.e. double byte) data. There is a methodology for delineating between the single byte and graphic data in the character string. It is called “shift-in” and “shift-out” characters. Db2 uses these special characters in a character string to delineate between single byte and double byte data in the same string.
MIXED character data contains “shift-in” (x’0E’) and “shift-out” (x’0F’) characters to delineate single-byte data from double-byte data. Below are some examples of the use of shift-in and shift-out characters.
Shift-In/Shift-Out and SBCS/DBCS data examples
Example: Create a MIXED table name with a MIXED column name:
------------------------------------------------------------------------
CREATE TABLE ABCOWN." âDâEâPâTâX MIXED TBNAME"
44CDCCEC4ECCDC4CCCDED4704C4C4D4E4E0DCECC4ECDCDC7444444444444444444444444
003951350312350123665BFE2425272327F497540325145F000000000000000000000000
-----------------------------------------------------------------------
(COL01 CHAR(1) NOT NULL,
4444444444CDDFF4444444444444444444444444CCCD4F544DDE4DEDD644444444444444
000000000D3630100000000000000000000000003819D1D0056305433B00000000000000
-----------------------------------------------------------------------
" âSâUâ0âNâO MIXED COLNAME" CHAR(2))
444444444704E4E4F4D4D0DCECC4CDDDCDC74444CCCD4F55444444444444444444444444
000000000FE2224202526F4975403635145F00003819D2DD000000000000000000000000
-----------------------------------------------------------------------
IN ABCDB.ABCTS;
444444444CD4CCCCC4CCCEE5444444444444444444444444444444444444444444444444
00000000095012342B12332E000000000000000000000000000000000000000000000000
------------------------------------------------------------------------
SELECT * FROM SYSIBM.SYSTABLES WHERE DBNAME = 'ABCDB';
NAME CREATOR
-----------------------------------------------------------------------
âDâEâPâTâX MIXED TBNAME ABCOWN
04C4C4D4E4E0DCECC4ECDCDC444CCCDED44444444444444444444444444444444444444
E2425272327F49754032514500012366500000000000000000000000000000000000000
-----------------------------------------------------------------------
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
-----------------------------------------------------------------------
SELECT * FROM SYSIBM.SYSCOLUMNS WHERE TBCREATOR = 'ABCOWN';
NAME TBNAME TBCREATOR
------------------------------------------------------------------------
âSâUâ0âNâO MIXED COLNAME âDâEâPâTâX MIXED TBNAME ABCOWN
04E4E4F4D4D0DCECC4CDDDCDC4404C4C4D4E4E0DCECC4ECDCDC444CCCDED444444444444
E2224202526F497540363514500E2425272327F497540325145000123665000000000000
------------------------------------------------------------------------
COL01 âDâEâPâTâX MIXED TBNAME ABCOWN
CDDFF444444444444444444444404C4C4D4E4E0DCECC4ECDCDC444CCCDED444444444444
363010000000000000000000000E2425272327F497540325145000123665000000000000
------------------------------------------------------------------------
DSNE610I NUMBER OF ROWS DISPLAYED IS 2
------------------------------------------------------------------------
The above TABLE NAME (âDâEâPâTâX MIXED TBNAME) has both SBCS (single-byte) and DBCS (double-byte) data embedded in the name.
The above COLUMN NAME (âSâUâ0âNâO MIXED COLNAME) has both SBCS (single-byte) and DBCS (double-byte) data embedded in the name.
Example: To insert DBCS CHAR FOR MIXED DATA and GRAPHIC data:
-------------------------------------------------------------------------
CREATE TABLESPACE ABCTS IN ABCDB
BUFFERPOOL BP8K0
CCSID UNICODE
USING STOGROUP SYSDEFLT PRIQTY 128 SECQTY 128;
CREATE TABLE ABCOWN.ABCTB
( C1 INT NOT NULL
,C2 CHAR(50) NOT NULL FOR MIXED DATA
,C3 GRAPHIC(50) NOT NULL )
IN ABCDB.ABCTS;
------------------------------------------------------------------------
INSERT INTO ABCOWN.ABCTB VALUES(
CDECDE4CDED4CCCDED4CCCEC4ECDECE44444444444444444444444444444444444444444
952593095360123665B123320513452D0000000000000000000000000000000000000000
-----------------------------------------------------------------------
11,
44444444444444FF64444444444444444444444444444444444444444444444444444444
0000000000000011B0000000000000000000000000000000000000000000000000000000
-----------------------------------------------------------------------
' á"ã"áÃáká"á»áÊácá" FOR MIXED DATA',
44444444444444704747464947484748470CDD4DCECC4CCEC76444444444444444444444
00000000000000DE5F6F56525F5B52535FF66904975404131DB000000000000000000000
-----------------------------------------------------------------------
G' àaàbàcàdàe ');
4444444444444C7048484848480755444444444444444444444444444444444444444444
00000000000007DE4142434445FDDE000000000000000000000000000000000000000000
-----------------------------------------------------------------------
SELECT * FROM ABCOWN.ABCTB;
C1 C2 C3
------------------------------------------------------------------------
11 á"ã"áÃáká"á»áÊácá" FOR MIXED DATA àaàbàcàdàe
44444444FF4404747464947484748470CDD4DCECC4CCEC44444444048484848480444444
000000001100E5F6F56525F5B52535FF6690497540413100000000E4142434445F000000
------------------------------------------------------------------------
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
------------------------------------------------------------------------
Note that the TABLESPACE was created with CCSID UNICODE. In a MIXED=NO Db2 subsystem, the CREATE TABLE would have failed (SQLCODE -622) if the encoding scheme assigned to the table being created was not a UNICODE table. In a MIXED=YES Db2 subsystem, the encoding scheme assigned to the table could have been EBCDIC, ASCII, or UNICODE.
................
................
In order to avoid copyright disputes, this page is only a partial summary.
To fulfill the demand for quickly locating and searching documents.
It is intelligent file search solution for home and business.
Related download
- transformation responses consolidation v1
- an introduction to stata for health researchers
- miscellaneous notes
- establishing a connections
- unicode support in ebcdic based system
- east tennessee state university
- ifs internationalization support unicode
- vb objects for marc and utf 8
- file format wikipedia the free encyclopedia
- xul creating localizable xml gui