Miscellaneous Notes



Db2 for z/OS CCSID Overview

Db2 for z/OS leverages the z/OS operating system’s Conversion Services facilities to enable Db2 support of character data stored in various encoding schemes and code pages. These z/OS services enable Db2 for z/OS to work with (i.e. receive as input, process, store, and retrieve) character data represented by the various code pages in the encoding schemes EBCDIC, ASCII, and UNICODE.

This paper summarizes, at a moderately technical level, the ways and methods in which Db2 for z/OS can be configured and employed to manage character data from a disparate set of encoding schemes and code pages. It also discusses the ways in which Db2 for z/OS leverages z/OS Conversion Services for the purposes of managing and manipulating character data.

For the remainder of this discussion, the term Db2 will mean Db2 for z/OS (as opposed to Db2 LUW, etc.). This discussion is exclusively for Db2 for z/OS, where the term Db2 is concerned and used. References to z/OS are exclusively for those services and programs executing in a z/OS operating system.

This discussion pertains only to character data, never to numeric data or binary (BIT) data. For this purpose, one should note the difference between the character ‘1’ and the number 1. ‘1’ is a character data type, 1 is a numeric data type. This discussion only pertains to character data types. Numeric and binary (BIT) data in Db2 are not a part of this discussion.

DISCLAIMER: The following discussion is a personal interpretation of the IBM provided documentation on these topics. Because interpretations of technical documentation may vary from person to person, and are therefore subject to scrutiny, the user should always rely on IBM documentation for the definitive and authoritative explanation of each topic.

This paper is as of Db2 12 for z/OS and z/OS 2.2, and was written on Oct. 12, 2017 by:

Kurt Bohnert

Manager, Db2 Systems Programming

Rocket Software, Inc.

In computing, an “encoding scheme” can be defined as a set of specific definitions and rules that describe a philosophy used to represent character data. One may think of an encoding scheme as simply a set of rules used for representing character data.

Some examples of encoding scheme rules might include (but are not limited to) the following: The number of bits used for each character, the number of bytes used for each character, the allowable ranges of bytes, maximum number of characters, … there are many other possible rules that might apply in an encoding scheme. The rules of an encoding scheme govern the way in which characters in the encoding scheme are represented in computing.

Db2 for z/OS is enabled to work with (manage) data governed by any one of the following three encoding schemes: EBCDIC, ASCII, and UNICODE.

Within each encoding scheme, there can be many “code pages”. A code page consists of a table of values that describes the character set for a given language. Each code page in an encoding scheme must conform to the specific rules for that encoding scheme. The various code pages in an encoding scheme are different however, in that they may represent different languages (as one example), yet they all conform to the rules of the encoding scheme in which they reside. There will be examples of this later in this paper.

IBM Db2 employs an IBM concept called a “CCSID”. CCSID stands for Coded Character Set ID. A CCSID identifies an encoding scheme and a code page for that encoding scheme. Although a CCSID is technically a more precise definition of a code page, one may think of a CCSID and a code page as synonymous for the remainder of this paper. In those cases where it is important to distinguish between the two, the distinction will be made.

Db2 assigns a CCSID to all character data it manages. All character data passed to Db2 from an application/agent is assigned a CCSID. The character data is said to be “tagged” with the CCSID. The CCSID is stored with the character data if the data is stored in a Db2 data base. It becomes part of the “metadata” for that character data. Metadata is “data about data”. Metadata contains information about the data being managed. CCSID is always a part of the metadata for character data stored in Db2 and manipulated by Db2.

CCSID’s in Db2 are grouped in three’s … referred to in this discussion as “triplets”. One of the three CCSID’s in a triplet is the Single-Byte-Character-Set (SBCS), one of the three CCSID’s is the Double-Byte-Character-Set (DBCS), and one of the three CCSID’s is the MIXED character set. This will be discussed in greater detail below. The important point to note for now is that Db2 CCSID’s come in three’s … in this discussion referred to as triplets.

So CCSID’s are Db2’s implementation of character data management within the Db2 DBMS (Data Base Management System). From this point forward in the paper, we are only concerned with how, when, and where Db2 employs CCSID’s in support of character data interpretation, manipulation, and storage.

Db2 recognizes many CCSID triplets for both the EBCDIC and ASCII encoding schemes. Again, a triplet is three CCSID’s … a SBCS CCSID, a DBCS CCSID, and a MIXED CCSID. Db2 only recognizes one CCSID triplet for UNICODE however, because UNICODE (unlike EBCDIC and ASCII) only contains one triplet. As will be seen later in this paper, UNICODE is an industry wide attempt to standardize and unify all character schemes (i.e. all the various characters) from all over the computing world, into a single triplet. UNICODE can be thought of as a single code page (triplet) containing all characters from all other encoding schemes and code pages. This will be discussed in greater detail below.

It should be noted here that (as previously stated) a CCSID is a more precise definition of a code page. This means (among other things) that there can be a one-to-many relationship between code page and CCSID.

Here is an example of two CCSID’s in the same code page within the EBCDIC encoding scheme. EBCDIC code page 37 has (at least) two CCSID’s. They are CCSID 0037 and CCSID 1140. The only difference between the two CCSID’s is at code point 9F in the two CCSID tables. In CCSID 0037 you will find the "¤" (currency sign) character at code point 9F, but in CCSID 1140 at code point 9F you will find the "€" (Euro sign) character. CCSID 0037 might be considered a standard SBCS CCSID in North America, and CCSID 1140 a standard SBCS CCSID in Europe.

On the next page is a copy of the CCSID table for CCSID 0037. It was cut and paste off the internet. You can find many such examples of CCSID’s and code pages on the internet. In fact, you can likely find the table for every CCSID and code page ever constructed, on the internet.

Note in the example below that the code point 9F contains the "¤" (currency sign) character. If one looked at CCSID 1140, one would see that the only difference between the two CCSID’s is at this code point (9F), where CCSID 1140 contains the "€" (Euro sign) character.

CCSID 1140 is the Euro currency update of CCSID 0037. In CCSID 1140, the "¤" (currency sign) character at code point 9F is replaced with the "€" (Euro sign) character.

See code point 9F in red, in the table below.

|CCSID 0037 |

|_0 |_1 |_2 |_3 |_4 |_5 |_6 |_7 |_8 |_9 |_A |_B |_C |_D |_E |_F | | 

0_

  |NUL

0000

0 |SOH

0001

1 |STX

0002

2 |ETX

0003

3 |SEL

009C

4 |HT

0009

5 |RNL

0086

6 |DEL

007F

7 |GE

0097

8 |SPS

008D

9 |RPT

008E

10 |VT

000B

11 |FF

000C

12 |CR

000D

13 |SO

000E

14 |SI

000F

15 | | 

1_

  |DLE

0010

16 |DC1

0011

17 |DC2

0012

18 |DC3

0013

19 |RES ENP

009D

20 |NL

0085

21 |BS

0008

22 |POC

0087

23 |CAN

0018

24 |EM

0019

25 |UBS

0092

26 |CU1

008F

27 |IFS

001C

28 |IGS

001D

29 |IRS

001E

30 |IUS ITB

001F

31 | | 

2_

  |DS

0080

32 |SOS

0081

33 |FS

0082

34 |WUS

0083

35 |BYP INP

0084

36 |LF

000A

37 |ETB

0017

38 |ESC

001B

39 |SA

0088

40 |SFE

0089

41 |SM SW

008A

42 |CSP

008B

43 |MFA

008C

44 |ENQ

0005

45 |ACK

0006

46 |BEL

0007

47 | | 

3_

  |

0090

48 |

0091

49 |SYN

0016

50 |IR

0093

51 |PP

0094

52 |TRN

0095

53 |NBS

0096

54 |EOT

0004

55 |SBS

0098

56 |IT

0099

57 |RFF

009A

58 |CU3

009B

59 |DC4

0014

60 |NAK

0015

61 |

009E

62 |SUB

001A

63 | | 

4_

  |SP

0020

64 |RSP

00A0

65 |â

00E2

66 |ä

00E4

67 |à

00E0

68 |á

00E1

69 |ã

00E3

70 |å

00E5

71 |ç

00E7

72 |ñ

00F1

73 |¢

00A2

74 |.

002E

75 |<

003C

76 |(

0028

77 |+

002B

78 ||

007C

79 | | 

5_

  |&

0026

80 |é

00E9

81 |ê

00EA

82 |ë

00EB

83 |è

00E8

84 |í

00ED

85 |î

00EE

86 |ï

00EF

87 |ì

00EC

88 |ß

00DF

89 |!

0021

90 |$

0024

91 |*

002A

92 |)

0029

93 |;

003B

94 |¬

00AC

95 | | 

6_

  |-

002D

96 |/

002F

97 |Â

00C2

98 |Ä

00C4

99 |À

00C0

100 |Á

00C1

101 |Ã

00C3

102 |Å

00C5

103 |Ç

00C7

104 |Ñ

00D1

105 |¦

00A6

106 |,

002C

107 |%

0025

108 |_

005F

109 |>

003E

110 |?

003F

111 | | 

7_

  |ø

00F8

112 |É

00C9

113 |Ê

00CA

114 |Ë

00CB

115 |È

00C8

116 |Í

00CD

117 |Î

00CE

118 |Ï

00CF

119 |Ì

00CC

120 |`

0060

121 |:

003A

122 |#

0023

123 |@

0040

124 |'

0027

125 |=

003D

126 |"

0022

127 | | 

8_

  |Ø

00D8

128 |a

0061

129 |b

0062

130 |c

0063

131 |d

0064

132 |e

0065

133 |f

0066

134 |g

0067

135 |h

0068

136 |i

0069

137 |«

00AB

138 |»

00BB

139 |ð

00F0

140 |ý

00FD

141 |þ

00FE

142 |±

00B1

143 | | 

9_

  |°

00B0

144 |j

006A

145 |k

006B

146 |l

006C

147 |m

006D

148 |n

006E

149 |o

006F

150 |p

0070

151 |q

0071

152 |r

0072

153 |ª

00AA

154 |º

00BA

155 |æ

00E6

156 |¸

00B8

157 |Æ

00C6

158 |¤

00A4

159 | | 

A_

  |µ

00B5

160 |~

007E

161 |s

0073

162 |t

0074

163 |u

0075

164 |v

0076

165 |w

0077

166 |x

0078

167 |y

0079

168 |z

007A

169 |¡

00A1

170 |¿

00BF

171 |Ð

00D0

172 |Ý

00DD

173 |Þ

00DE

174 |®

00AE

175 | | 

B_

  |^

005E

176 |£

00A3

177 |¥

00A5

178 |·

00B7

179 |©

00A9

180 |§

00A7

181 |¶

00B6

182 |¼

00BC

183 |½

00BD

184 |¾

00BE

185 |[

005B

186 |]

005D

187 |¯

00AF

188 |¨

00A8

189 |´

00B4

190 |×

00D7

191 | | 

C_

  |{

007B

192 |A

0041

193 |B

0042

194 |C

0043

195 |D

0044

196 |E

0045

197 |F

0046

198 |G

0047

199 |H

0048

200 |I

0049

201 |SHY

00AD

202 |ô

00F4

203 |ö

00F6

204 |ò

00F2

205 |ó

00F3

206 |õ

00F5

207 | | 

D_

  |}

007D

208 |J

004A

209 |K

004B

210 |L

004C

211 |M

004D

212 |N

004E

213 |O

004F

214 |P

0050

215 |Q

0051

216 |R

0052

217 |¹

00B9

218 |û

00FB

219 |ü

00FC

220 |ù

00F9

221 |ú

00FA

222 |ÿ

00FF

223 | | 

E_

  |\

005C

224 |÷

00F7

225 |S

0053

226 |T

0054

227 |U

0055

228 |V

0056

229 |W

0057

230 |X

0058

231 |Y

0059

232 |Z

005A

233 |²

00B2

234 |Ô

00D4

235 |Ö

00D6

236 |Ò

00D2

237 |Ó

00D3

238 |Õ

00D5

239 | | 

F_

  |0

0030

240 |1

0031

241 |2

0032

242 |3

0033

243 |4

0034

244 |5

0035

245 |6

0036

246 |7

0037

247 |8

0038

248 |9

0039

249 |³

00B3

250 |Û

00DB

251 |Ü

00DC

252 |Ù

00D9

253 |Ú

00DA

254 |EO

009F

255 | | |_0 |_1 |_2 |_3 |_4 |_5 |_6 |_7 |_8 |_9 |_A |_B |_C |_D |_E |_F | |

Now that we have defined many of the pertinent terms, it is time to discuss how, when, and where Db2 employs CCSID’s in support of character data interpretation, manipulation, and storage. The important point to keep in mind during this discussion is this: Every character string used in a Db2 SQL operation, and every character string manipulated and/or stored by Db2 in any way, is tagged with a CCSID.

Recall that CCSID’s in Db2 come in three’s … there are three (3) CCSID’s associated together into a related group (a triplet):

o SBCS Single Byte Character Set CCSID

o DBCS Double Byte Character Set (also called GRAPHIC) CCSID

o MIXED Contains both SBCS data and DBCS data. SBCS and DBCS

are both always a subset of MIXED for their triplet

All CCSID triplets reside within one of the Encoding Schemes supported by Db2 for z/OS: EBCDIC, ASCII, or UNICODE.

When a Db2 subsystem is initially configured (built), there is an Application Programming Defaults load module created named DSNHDECP. This load module contains several application default parameters which are directly associated with this discussion. Here is a sample list of all parameters defined in DSNHDECP for Db2 V11. The parameters of interest in this discussion are in bold blue text:

ASCCSID=1115,

AMCCSID=1381,

AGCCSID=1380,

SCCSID=836,

MCCSID=935,

GCCSID=837,

USCCSID=367,

UMCCSID=1208,

UGCCSID=1200,

ENSCHEME=EBCDIC,

APPENSCH=EBCDIC,

DATE=USA,

DATELEN=0,

DECARTH=DEC15,

DECIMAL=COMMA,

DEF_DECFLOAT_ROUND_MODE=ROUND_HALF_EVEN,

DEFLANG=IBMCOB,

DELIM=DEFAULT,

IMPLICIT_TIMEZONE=CURRENT,

MIXED=YES,

NEWFUN=V11,

PADNTSTR=YES,

SQLDELI=DEFAULT,

DSQLDELI=APOST,

SSID=QB1M,

STDSQL=NO,

TIME=JIS,

TIMELEN=0,

DYNRULS=YES,

LC_CTYPE=

A default CCSID triplet is assigned for each of the three encoding schemes employed by Db2 (EBCDIC, ASCII, and UNICODE) in DSNHDECP (above) at subsystem installation. In our example above the defined defaults are:

ASCCSID=1115,

AMCCSID=1381,

AGCCSID=1380,

SCCSID=836,

MCCSID=935,

GCCSID=837,

USCCSID=367,

UMCCSID=1208,

UGCCSID=1200,

Along with these default CCSID triplets, three other related parameters are assigned in DSNHDECP:

ENSCHEME=EBCDIC,

APPENSCH=EBCDIC,

MIXED=YES,

Here is a summary for each of the DSNHDECP parameters of interest in this discussion (in bold blue text above):

ASCCSID The ASCII SBCS default CCSID

AMCCSID The ASCII MIXED default CCSID

AGCCSID The ASCII DBCS default CCSID

SCCSID The EBCDIC SBCS default CCSID

MCCSID The EBCDIC MIXED default CCSID

GCCSID The EBCDIC DBCS default CCSID

USCCSID The UNICODE SBCS default CCSID (always 367 in Db2)

UMCCSID The UNICODE MIXED default CCSID (always 1208 in Db2)

UGCCSID The UNICODE DBCS default CCSID (always 1200 in Db2)

ENSCHEME This is the default Encoding Scheme if not specified with the

CCSID parameter in the Db2 SQL CREATE statement.

APPENSCH This is the default Encoding Scheme (or CCSID) if not specified

with the ENCODING parameter in the Db2 BIND command.

MIXED Is mixed character data (SBCS and DBCS together) allowed (YES|NO)?

This is discussed in greater detail below.

There are a variety of CCSID triples supported by Db2 for z/OS for both EBCDIC and ASCII. Tables of valid (supported) EBCDIC and ASCII CCSID triplets can be found in the Db2 for z/OS Installation Guide. If you look at these tables, you will see that in our sample DSNHDECP above the defaults selected were for the language named “Simplified Chinese”:

Table 154. EBCDIC double-byte coded character set identifiers (CCSIDs)

National language MCCSID SCCSID GCCSID

Simplified Chinese 935 836 837

Table 155. ASCII double-byte coded character set identifiers (CCSIDs)

National language MCCSID SCCSID GCCSID

Simplified Chinese 1381 1115 1380

The UNICODE triplet is always as follows in Db2 for z/OS:

USCCSID=367,

UMCCSID=1208,

UGCCSID=1200,

For each triplet (regardless of encoding scheme):

o The SBCS CCSID (SCCSID, ASCCSID, and USCCSID) is a 1 byte CCSID (i.e. each character in the CCSID is a 1-byte character).

o The DBCS CCSID (GCCSID, AGCCSID, and UGCCSID) is a 2-4 byte CCSID (i.e. each character in the CCSID is a 2-to-4 bytes character). Note that this CCSID is often referred to as the “GRAPHIC” CCSID in the triplet.

o The MIXED CCSID (MCCSID, AMCCSID, and UMCCSID) is a 1-4 byte CCSID, since it contains all associated SBCS CCSID characters and all associated DBCS CCSID characters in the triplet.

A good practice is to set the EBCDIC triplet to the “green screen” 3270 emulator settings employed by the users of this Db2 subsystem.

A good practice is to set the ASCII triplet to the most commonly used workstation client settings for the users of this Db2 subsystem.

Note that in the above sample, MIXED=YES. The value of MIXED (YES|NO) indicates how the EBCDIC CCSID and ASCII CCSID fields are to be interpreted by Db2.

If MIXED=NO, then the EBCDIC (GCCSID and MCCSID) and ASCII (AGCCSID and AMCCSID) DBCS and MIXED CCSID’s are set to the default 65534, which means NO CCSID. This is because double-byte character data is not allowed for EBCDIC and ASCII character data in this Db2 subsystem, if MIXED=NO.

If MIXED=YES, then double byte character data is allowed for EBCDIC and ASCII character data in this Db2 subsystem. The EBCDIC (GCCSID and MCCSID) and ASCII (AGCCSID and AMCCSID) DBCS and MIXED CCSID’s are set to the appropriate CCSID’s to complete the requested triplet (the matching CCSID’s for the SBCS CCSID), as in our sample above.

Note: Do not confuse CCSID 65534 versus 65535:

CCSID 65534 means NO CCSID (used when MIXED = NO).

CCSID 65535 means NO CONVERSION (used to tag “FOR BIT DATA” data types … see

below for more on this).

UNICODE is always considered to be MIXED=YES, and as previously stated, is always set to USCCSID=367, UMCCSID=1208, and UGCCSID=1200 in Db2. This means that regardless of the value for MIXED in DSNHDECP, UNICODE character data can always contain single-byte and double-byte data.

Side note: NEVER change the values for either MIXED or any CCSID in the DSNHDECP of an existing Db2 subsystem, without first consulting with IBM!!! Per IBM, once these values have been specified, they should not be changed without IBM consultation.

When configuring a Db2 subsystem, we are primarily interested in three (3) issues pertaining to the setting of the DSNHDECP parameters listed above:

1. What CCSID is used to tag character data for storage within a Db2 table?

2. What CCSID is used to interpret incoming character data to Db2?

3. What CCSID is used to interpret incoming SQL during Db2 precompile (PC)?

Below is a discussion on each of the three issues of interest listed above.

1. What CCSID is used to tag character data for storage within a Db2 table (i.e. how is character data “tagged” for storage in Db2)?

Note that in the explanation below, character data is said to have one of the character (e.g. CHAR) data types in Db2.

FIRST: Identify the ENCODING SCHEME that was assigned to the TABLE in the SQL CREATE statements used to create the DATABASE.TABLESPACE.TABLE (DB.TS.TB):

IF CREATE TABLE … includes keyword CCSID UNICODE|EBCDIC|ASCII

THEN … it must match the value assigned at CREATE TS (below).

ELSE … it defaults to the value assigned at CREATE TS (below).

IF CREATE TABLESPACE … includes keyword CCSID UNICODE|EBCDIC|ASCII

THEN … it is assigned to the TS.

ELSE … it defaults to the value assigned at CREATE DB (below).

IF CREATE DATABASE … includes keyword CCSID UNICODE|EBCDIC|ASCII

THEN … it is the default if CREATE TS does not assign CCSID (above).

ELSE … it defaults to the value found in DSNHDECP parameter ENSCHEME.

Note that the CCSID parameter is always optional in the three CREATE statements above, and defaults based on the rules above.

SECOND: Identify the CCSID to use (based on the encoding scheme identified above), the MIXED value in DSNHDECP, and the data type (SBCS, DBCS, MIXED) of the character column:

IF MIXED = NO

IF “FOR MIXED DATA” data type THEN the DB.TS must be UNICODE (CCSID UNICODE in CREATE TS) else you get an error in the CREATE TABLE. The character data in this column is tagged with UNICODE MIXED CCSID 1208.

IF “GRAPHIC” data type THEN the DB.TS must be UNICODE (CCSID UNICODE in CREATE TS) else you get an error in the CREATE TABLE. The character data in this column is tagged with UNICODE DBCS CCSID 1200.

ELSE (not “FOR MIXED DATA” and not “GRAPHIC”) THEN the character data in this column is tagged with the SBCS CCSID of the DB.TS above.

IF MIXED = YES

IF “FOR SBCS DATA” data type THEN the character data in this column is tagged with the SBCS CCSID of the DB.TS above.

IF “GRAPHIC” data type THEN the character data in this column is tagged with the DBCS CCSID of the DB.TS above.

ELSE (not “FOR SBCS DATA” and not “GRAPHIC”) THEN the character data in this column is tagged with the MIXED CCSID of the DB.TS identified above.

2. What CCSID is used to interpret incoming character data to Db2 (e.g. character data passed to Db2 from a PLAN/PACKAGE)?

IF SQL is STATIC THEN

IF PLAN/PKG bound with ENCODING UNICODE|EBCDIC|ASCII|ccsid … THEN use that CCSID or the triplet associated with that encoding scheme to

interpret incoming character data from the PLAN/PKG.

ELSE use the triplet associated with the default encoding scheme in

APPENSCH to interpret incoming character data from the PLAN/PKG.

ELSE (SQL is DYN) use the triplet associated with the encoding scheme specified in the special register DEFAULT APPLICATION ENCODING SCHEME to interpret incoming character data from the PLAN/PACKAGE. The default value for this special register is the BIND option ENCODING (or DSNHDECP parameter APPENSCH if ENCODING not specified in the BIND … see above).

The application sending the character data to Db2 can override the above defaults and explicitly specify the CCSID to be used by Db2 to interpret incoming character data from the application, in one of several ways:

o EXEC SQL SET CURRENT APPLICATION ENCODING SCHEME = ‘…’;

o Any host variable (:hv) in SQL can override the above for CCSID via:

EXEC SQL DECLARE :hv VARIABLE CCSID UNICODE|EBCDIC|ASCII|ccsid;

o There is also a way to do it in SQLDA … see IBM doc for details.

3. What CCSID is used to interpret incoming SQL during Db2 precompile (PC)?

o There is a CCSID keyword in the PC: This PC parm tells the PC how to interpret incoming SQL for PC. It must be an EBCDIC CCSID!!

The default (if not specified) is the EBCDIC triplet in DSNHDECP.

This does not impact SQL :hv and character sting interpretation, only SQL statement interpretation (only the parsing of SQL statements).

From IBM doc: “Since all Db2 SQL keywords (e.g. SELECT) are representable on all EBCDIC code pages, there should not be a problem”.

When the SQL contains literals however, they are subject to data loss.

SQLCODE +355 means “Literal subject to data loss” … the usual substitution character in Db2 for z/OS is x’3F’ (see below for details).

Character conversion occurs when character data is moved from one CCSID to another (i.e. the tag is changed), but there are no hard and fast rules as to when a conversion occurs. A conversion occurs when necessary.

Example: When we store character data in Db2 for z/OS tagged with some EBCDIC CCSID, and we display that data on a client in an ASCII CCSID, the character data may need to be converted (translated) for display.

There are times when character conversion cannot occur, because some of the characters in the source CCSID cannot be mapped to the target CCSID. When that happens, characters in the source string are subject to conversion to a substitution character. The Db2 for z/OS substitution character is always x’3F’, and when this occurs you will see the SQLCODE +355.

Character conversion occurs in Db2 for z/OS transparently to the Db2 user. When Db2 determines that a conversion must occur, it invokes z/OS Conversion Services under the covers to perform the conversion. Once the conversion is completed, Db2 resumes normal processing of the data.

Character conversion is always performed at the receiver … example:

1. A client application sends an SQL SELECT statement to Db2 for z/OS with character data in the predicate.

2. Db2 for z/OS interprets the predicate character data per the above discussion (#2 above), and invokes z/OS Conversion Services if character conversion is required.

3. Db2 for z/OS returns a result set to the client. All character data in the result set is returned to the requester tagged with the CCSID in which it was stored in Db2 (i.e. Db2 does not convert it before sending).

4. The client interprets the character data and performs character conversion as required, via its own mechanisms on the client platform.

UNICODE is a single character set that encodes all the world’s scripts. New characters are being added to UNICODE all the time, as the CCSID continues to grow (to include more and more code pages). Therefore, character conversion is never required for UNICODE character data.

UNICODE attempts to standardize the representation and manipulation of data across vendors and platforms. The UNICODE standard provides a cross platform, cross vendor method of encoding character data that enables lossless representation and manipulation of character data (i.e. no substitution characters should ever be required for UNICODE data).

CCSID 367 is UNICODE SBCS, and is sometimes referred to as 7-bit ASCII.

CCSID 1200 is UNICODE DBCS, and is sometimes referred to as UTF-16.

CCSID 1208 is UNICODE MIXED, and therefore includes everything in CCSID 1200 (DBCS) and CCSID 367 (SBCS). CCSID 1208 is sometimes referred to as UTF-8.

When either EBCDIC or ASCII data is tagged with a MIXED CCSID, it means that the associated character string can contain both single byte and graphic (i.e. double byte) data. There is a methodology for delineating between the single byte and graphic data in the character string. It is called “shift-in” and “shift-out” characters. Db2 uses these special characters in a character string to delineate between single byte and double byte data in the same string.

MIXED character data contains “shift-in” (x’0E’) and “shift-out” (x’0F’) characters to delineate single-byte data from double-byte data. Below are some examples of the use of shift-in and shift-out characters.

Shift-In/Shift-Out and SBCS/DBCS data examples

Example: Create a MIXED table name with a MIXED column name:

------------------------------------------------------------------------

CREATE TABLE ABCOWN." âDâEâPâTâX MIXED TBNAME"

44CDCCEC4ECCDC4CCCDED4704C4C4D4E4E0DCECC4ECDCDC7444444444444444444444444

003951350312350123665BFE2425272327F497540325145F000000000000000000000000

-----------------------------------------------------------------------

(COL01 CHAR(1) NOT NULL,

4444444444CDDFF4444444444444444444444444CCCD4F544DDE4DEDD644444444444444

000000000D3630100000000000000000000000003819D1D0056305433B00000000000000

-----------------------------------------------------------------------

" âSâUâ0âNâO MIXED COLNAME" CHAR(2))

444444444704E4E4F4D4D0DCECC4CDDDCDC74444CCCD4F55444444444444444444444444

000000000FE2224202526F4975403635145F00003819D2DD000000000000000000000000

-----------------------------------------------------------------------

IN ABCDB.ABCTS;

444444444CD4CCCCC4CCCEE5444444444444444444444444444444444444444444444444

00000000095012342B12332E000000000000000000000000000000000000000000000000

------------------------------------------------------------------------

SELECT * FROM SYSIBM.SYSTABLES WHERE DBNAME = 'ABCDB';

NAME CREATOR

-----------------------------------------------------------------------

âDâEâPâTâX MIXED TBNAME ABCOWN

04C4C4D4E4E0DCECC4ECDCDC444CCCDED44444444444444444444444444444444444444

E2425272327F49754032514500012366500000000000000000000000000000000000000

-----------------------------------------------------------------------

DSNE610I NUMBER OF ROWS DISPLAYED IS 1

-----------------------------------------------------------------------

SELECT * FROM SYSIBM.SYSCOLUMNS WHERE TBCREATOR = 'ABCOWN';

NAME TBNAME TBCREATOR

------------------------------------------------------------------------

âSâUâ0âNâO MIXED COLNAME âDâEâPâTâX MIXED TBNAME ABCOWN

04E4E4F4D4D0DCECC4CDDDCDC4404C4C4D4E4E0DCECC4ECDCDC444CCCDED444444444444

E2224202526F497540363514500E2425272327F497540325145000123665000000000000

------------------------------------------------------------------------

COL01 âDâEâPâTâX MIXED TBNAME ABCOWN

CDDFF444444444444444444444404C4C4D4E4E0DCECC4ECDCDC444CCCDED444444444444

363010000000000000000000000E2425272327F497540325145000123665000000000000

------------------------------------------------------------------------

DSNE610I NUMBER OF ROWS DISPLAYED IS 2

------------------------------------------------------------------------

The above TABLE NAME (âDâEâPâTâX MIXED TBNAME) has both SBCS (single-byte) and DBCS (double-byte) data embedded in the name.

The above COLUMN NAME (âSâUâ0âNâO MIXED COLNAME) has both SBCS (single-byte) and DBCS (double-byte) data embedded in the name.

Example: To insert DBCS CHAR FOR MIXED DATA and GRAPHIC data:

-------------------------------------------------------------------------

CREATE TABLESPACE ABCTS IN ABCDB

BUFFERPOOL BP8K0

CCSID UNICODE

USING STOGROUP SYSDEFLT PRIQTY 128 SECQTY 128;

CREATE TABLE ABCOWN.ABCTB

( C1 INT NOT NULL

,C2 CHAR(50) NOT NULL FOR MIXED DATA

,C3 GRAPHIC(50) NOT NULL )

IN ABCDB.ABCTS;

------------------------------------------------------------------------

INSERT INTO ABCOWN.ABCTB VALUES(

CDECDE4CDED4CCCDED4CCCEC4ECDECE44444444444444444444444444444444444444444

952593095360123665B123320513452D0000000000000000000000000000000000000000

-----------------------------------------------------------------------

11,

44444444444444FF64444444444444444444444444444444444444444444444444444444

0000000000000011B0000000000000000000000000000000000000000000000000000000

-----------------------------------------------------------------------

' á"ã"áÃáká"á»áÊácá" FOR MIXED DATA',

44444444444444704747464947484748470CDD4DCECC4CCEC76444444444444444444444

00000000000000DE5F6F56525F5B52535FF66904975404131DB000000000000000000000

-----------------------------------------------------------------------

G' àaàbàcàdàe ');

4444444444444C7048484848480755444444444444444444444444444444444444444444

00000000000007DE4142434445FDDE000000000000000000000000000000000000000000

-----------------------------------------------------------------------

SELECT * FROM ABCOWN.ABCTB;

C1 C2 C3

------------------------------------------------------------------------

11 á"ã"áÃáká"á»áÊácá" FOR MIXED DATA àaàbàcàdàe

44444444FF4404747464947484748470CDD4DCECC4CCEC44444444048484848480444444

000000001100E5F6F56525F5B52535FF6690497540413100000000E4142434445F000000

------------------------------------------------------------------------

DSNE610I NUMBER OF ROWS DISPLAYED IS 1

------------------------------------------------------------------------

Note that the TABLESPACE was created with CCSID UNICODE. In a MIXED=NO Db2 subsystem, the CREATE TABLE would have failed (SQLCODE -622) if the encoding scheme assigned to the table being created was not a UNICODE table. In a MIXED=YES Db2 subsystem, the encoding scheme assigned to the table could have been EBCDIC, ASCII, or UNICODE.

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download