Miscellaneous Notes - The Rocket Software Blog



Db2 for z/OS BIND Optimization and Related Topics Overview

The Db2 for z/OS Optimizer is among the most complex components in the Db2 for z/OS DBMS (Data Base Management System). It is primarily responsible for selecting the access path to all data accessed in Db2 for z/OS. The optimizer code has evolved over many years, and is the one of the most influential components impacting Db2 for z/OS performance.

The BIND process within Db2 for z/OS is essentially the method by which application developers and DBA’s access the optimizer for data access path selection. It is not entirely inaccurate to think of BIND and the optimizer as (close to) synonymous. There are a multitude of factors which influence access path selection, including but not limited to the wide variety of BIND options available, numerous ZPARM and configuration settings, and z/OS system resource availability.

This paper discusses at a moderately technical level the Db2 Optimizer, the Db2 BIND/REBIND command(s), and technical issues and configurations closely associated with each topic.

For the remainder of this discussion, the term Db2 will mean Db2 for z/OS (as opposed to Db2 LUW, etc.). This discussion is exclusively for Db2 for z/OS, where the term Db2 is concerned and used. References to z/OS are exclusively for those services and programs executing in a z/OS operating system.

DISCLAIMER: The following discussion is a personal interpretation of the IBM provided documentation on these topics. Because interpretations of technical documentation may vary from person to person, and are therefore subject to scrutiny, the user should always rely on IBM documentation for the definitive and authoritative explanation of each topic.

This paper is as of Db2 12 for z/OS and z/OS 2.2, and was written on Jan. 12, 2018 by:

Kurt Bohnert

Manager, Db2 Systems Programming

Rocket Software, Inc.

The Db2 for z/OS Optimizer is base program code in the Db2 for z/OS DBMS. Its purpose is to convert SQL (often referred to as an SQL string or SQL statement) into an executable object (often referred to as a “prepared statement”). The optimizer is designed to identify the resources available to provide the most efficient access to the data, as requested by the SQL statement being optimized. Those available resources include (but are not limited to) indexes (IX), bufferpools (BP), in-memory storage (STG), the SORTWORK data base (aka DSNDB07), and several other Db2 and system resources which might be leveraged by Db2 for efficient data access.

It should be noted here that all SQL introduced to the Db2 subsystem for data access must first go through the optimization process, to be converted into an executable (aka a prepared statement). There are several ways in which an SQL statement may be passed to the optimizer. Below you will find discussions pertaining to Static SQL versus Dynamic SQL. The distinction between static and dynamic is basically one of determining when an SQL statement is optimized (static SQL is optimized at BIND time while dynamic SQL is optimized at runtime). Also, below are detailed discussions on the Db2 Precompiler (PC) and the Db2 BIND/REBIND command(s), both of which play significant roles in the optimization process. There are also discussions pertaining to a wide variety of other options and configurations within Db2 which play a role in the optimization of an SQL statement.

SQL is introduced into a Db2 subsystem for optimization and execution via a variety of access venues into Db2. Suffice it to say that every SQL statement ultimately is associated with a concept called a PACKAGE. One definition of a Db2 package is as follows:

Packages in Db2 are control-structure objects that contain executable forms of SQL statements or placement holders for executable forms.

A package is a created object, created as the result of a BIND PACKAGE command in Db2. The package contains one or more SQL statements, either pre-optimized into statically prepared SQL statements (i.e. static SQL), or place holders for dynamic SQL statements (i.e. dynamic SQL which is optimized at runtime). The input to the BIND process (input to the BIND command) is a Data Base Request Module (DBRM). A DBRM is created as the result of a Db2 process called a precompile (PC).

Before we discuss Db2 packages, let’s briefly discuss Db2 plans. In earlier versions of Db2, the BIND command produced a PLAN (as opposed to a PACKAGE). In fact, the command was BIND PLAN. Later versions of Db2 added the BIND PACKAGE command. There is a one-to-many relationship between a PLAN and a PACKAGE. A package is always (usually) associated with a plan. A package is always (usually) contained within a plan. A package contains one and only one DBRM. A plan may contain one-to-many packages.

Effective with Db2 V10, BIND PLAN cannot be used to bind a DBRM. To bind a DBRM, you must use BIND PACKAGE. The only purpose of BIND PLAN as of Db2 V10 is to group many related packages into a single plan. This is done with the BIND PLAN … PKLIST command. PKLIST is an option of BIND PLAN which allows you to assign a list of packages (PKLIST stands for package list) to a plan. Essentially, one of the purposes of a package is to provide more granular control by allowing the plan to be subdivided into many different packages. One-to-many SQL statements reside in a PACKAGE. One-to-many packages reside in a PLAN.

For the remainder of this paper, you may think of the terms “plan” and “package” as being synonymous. While there are differences between the two and they are not technically synonymous, there is far more in common between the two than different. This paper is concerned with the way in which SQL statements are optimized via the Db2 Optimizer, and therefore it is primarily concerned with BIND PACKAGE. For a detailed discussion on the differences between BIND PLAN and BIND PACKAGE, and for a discussion on the use of each in Db2, consult IBM Db2 documentation. For the remainder of this paper, BIND will refer to BIND PACKAGE and REBIND will refer to REBIND PACKAGE, unless otherwise specified.

Here is one common high level example of how SQL preparation, optimization, and execution works.

Example: When an application program must access Db2 data, an SQL statement is embedded in the application code (e.g. COBOL, PL/1, ASSEMBLER, etc.). For the sake of this example, we will assume the application program is a COBOL program. The source COBOL program (containing embedded SQL) is used as input to a Db2 supplied program called a precompiler (PC). The PC produces two outputs:

1. The PC produces as output an altered version of the source COBOL program. The PC does this by removing (commenting out) the embedded SQL statements in the source program and replacing them with standard COBOL CALL statements. The COBOL CALL statement is a call to a Db2 attach “stub”, which is essentially a Db2 entry point program. The PC must comment out the embedded SQL and replace it with syntactically correct CALL statements because the embedded SQL statements are not syntactically correct COBOL, and the subsequent COBOL compile will otherwise fail with syntax errors. Note that one input option to the PC is to tell it which programming language is being used (e.g. in this case COBOL), so that it knows the correct CALL statement to insert into the code in place of the SQL statements. The updated/altered source COBOL program produced by the PC is used as input to a standard COBOL compile and LKED (link-edit).

2. The PC also produces a module called a DBRM (Data Base Request Module). This DBRM contains the SQL removed from the source COBOL program. The SQL is stored in UNICODE in a DBRM member, and the member can be placed into a standard z/OS PDS (Partitioned Data Set). This DBRM produced by the PC is used as input to the BIND PACKAGE command in Db2.

The updated source COBOL program is now eligible for standard COBOL compile and link-edit (LKED), producing a standard z/OS load module. The DBRM is used as input to the BIND PACKAGE process, which accesses the Db2 optimizer to produce a Db2 package containing either an executable form of each SQL statement (static), or a place holder for each SQL statement (dynamic).

The last action of the PC is to assign what is called a CONTOKEN (aka consistency token) to both the updated source COBOL program, and the DBRM. While this CONTOKEN can be specified with a PC option, it is often allowed to default to a timestamp. This CONTOKEN ties the program load module (produced by the subsequent COBOL compile and LKED) to the package (produced by the subsequent BIND PACKAGE command executed on the DBRM).

At runtime, the load module is scheduled for execution via normal z/OS and Db2 work scheduling, and the CALL statements placed in the module (by the Db2 PC) direct the program to the executable SQL statements in the package. If the CONTOKEN in the load module does not match the CONTOKEN in the PACKAGE, Db2 will return a mismatch error to the program. If a mismatch occurs, one may compare the CONTOKEN of the package as it is stored in SYSIBM.SYSPACKAGE (Db2 metadata), to the load module. Note that the CONTOKEN is stored in SYSIBM.SYSPACKAGE exactly as it is stored in the DBRM used as input to the BIND PACKAGE command. The CONTOKEN in the load module however may be stored differently, depending on the source programming language employed. Actual examples follow:

For C programming language:

SELECT CONTOKEN FROM SYSIBM.SYSPACKAGE WHERE NAME = ‘dbrm-name’;

--------

.Óñ..·.]

1E400B2B

9E90D3DB

--------

The DBRM CONTOKEN (when viewing the DBRM in HEX):

--------

Óñ · ]

1E400B2B

9E90D3DB

--------

The load module CONTOKEN (when viewing the load module in HEX):

--------------------------

.Ó .N>x.ñ. .Nø ..· .NÊx..]

1E40D6A04040D7400B40D7A02B

9E005E7890005010D3005278DB

--------------------------

For COBOL programming language:

SELECT CONTOKEN FROM SYSIBM.SYSPACKAGE WHERE NAME = ‘dbrm-name’;

--------

.Óñ..·.]

14E10026

9DDA9A22

--------

The DBRM CONTOKEN (when viewing the DBRM in HEX):

----------------

.(Ò....Â

14E10026

9DDA9A22

----------------

The load module CONTOKEN (when viewing the load module in HEX):

--------

...Â.(Ò.

002614E1

9A229DDA

--------

For PL/1 programming language:

SELECT CONTOKEN FROM SYSIBM.SYSPACKAGE WHERE NAME = ‘dbrm-name’;

--------

.g.p.oÚ.

181919F1

97F7E6EB

--------

The DBRM CONTOKEN (when viewing the DBRM in HEX):

--------

.g.p.oÚ.

181919F1

97F7E6EB

--------

The load module CONTOKEN (when viewing the load module in HEX):

--------

.g.p.oÚ.

181919F1

97F7E6EB

--------

For Assembler programming language:

SELECT CONTOKEN FROM SYSIBM.SYSPACKAGE WHERE NAME = ‘dbrm-name’;

--------

.v.Ó.)Cy

1A2E15CA

95FEED38

--------

The DBRM CONTOKEN (when viewing the DBRM in HEX):

--------

.v.Ó.)Cy

1A2E15CA

95FEED38

--------

The load module CONTOKEN (when viewing the load module in HEX):

--------

.v.Ó.)Cy

1A2E15CA

95FEED38

--------

Let’s perform a more detailed analysis of the BIND PACKAGE command. BIND PACKAGE builds an application package. Db2 records the description of the package in several Db2 catalog tables (e.g. SYSIBM.SYSPACKAGE) and saves the prepared package in the Db2 Directory Skeleton Package Table (SKPT). The input to BIND PACKAGE is a DBRM, which was one output from a PC (above). The DBRM contains the SQL statements stripped out of the application program by the PC and placed into the DBRM. When BIND PACKAGE is executed, it either invokes the Db2 Optimizer to optimize each static SQL statement into an executable form in the package, or it puts a place-holder in the package for each dynamic SQL statement in the DBRM.

Db2 can distinguish between a static SQL statement which is optimized at BIND time, and a dynamic SQL statement which is optimized at runtime. Without going into a great deal of detail, a dynamic SQL statement is essentially the result of two SQL statements called a PREPARE and EXECUTE. The two SQL statements work together. Another option is to tell Db2 to manage them as one statement by coding an EXECUTE IMMEDIATE statement in place of the PREPARE and EXECUTE statements. Since this paper is not intended to be a discussion on application programming, all references to static versus dynamic SQL will adhere to these definitions. For more on these SQL statements, see the IBM Db2 for z/OS Application Programming and Reference Guide (and applicable IBM Db2 doc).

A BIND PACKAGE command takes the following form:

BIND PACKAGE(coll-id) MEMBER(dbrm-name) … many options, some discussed below

A PACKAGE (PKG) is bound from exactly one DBRM. Package names take the following form: LOCATION.COLLID.DBRM-NAME.CONTOKEN.

o LOCATION is always blank for locally bound packages.

o COLLID is specified in the BIND command and discussed immediately below.

o DBRM-NAME is specified in the BIND command.

o CONTOKEN is the CONTOKEN created in the PC.

A COLLID (coll-id) is a collection of packages. There are several ways to create a COLLID, including the following:

o GRANT PACKADM ON coll-id … ;

o GRANT CREATE ON coll-id … ;

o Issue a BIND command using a new coll-id.

When a COLLID is created via one of the above methods, a row is inserted into the Db2 catalog table SYSIBM.SYSRESAUTH. This table tracks the ability of users to use various “resources”, such as COLLID’s. When one binds a DBRM into a PKG, one must specify a COLLID in the BIND PACKAGE command, and one must have the authority to BIND into that COLLID. If a COLLID which does not yet exist is specified in the BIND PACKAGE command, the AUTH-ID issuing the BIND command must have the Db2 authority to create a new COLLID. See IBM Db2 doc for more on authorities required to perform these functions.

There are many options which can be specified in the BIND PACKAGE command. In most cases, if the option is not specified, there is a default value for that option (e.g. in some cases a ZPARM is the default). For a complete list of BIND PACKAGE options (and defaults), and greater detail on each option, see the IBM Db2 Command Reference Guide (and other IBM Db2 doc). A few sample options are discussed below.

BIND PACKAGE … ACTION(REPLACE|ADD) …

When one binds a DBRM into a PKG, one may choose (via the ACTION option) to either REPLACE an existing PKG or ADD a new PKG. REPLACE is the default when ACTION not specified. If ACTION(REPLACE) is employed and the package does not yet exist, a new package will be created as if ACTION(ADD) was specified.

Note the distinction between the BIND ACTION(REPLACE) command and the REBIND command (which is discussed in greater detail below):

o BIND w/ REPLACE is used if the SQL changed and a new DBRM was created (via a new PC) due to the SQL change. The BIND uses the new DBRM to replace the old package with a new package (containing the SQL changes).

o REBIND is used when no SQL changed, but one wishes the optimizer to reconsider the static SQL access paths previously chosen with the prior BIND. An access path might improve as the result of a new index or larger BP’s (or for a variety of other reasons), and a REBIND can be employed (using the same DBRM) to recreate (REBIND) the PKG.

BIND PACKAGE … VALIDATE(RUN|BIND) …

When a PKG is bound, the Db2 optimizer always validates the existence of all Db2 objects referenced in the DBRM. If a referenced object does not exist at BIND time, the VALIDATE option determines if the BIND fails or simply throws a warning message and continues. If VALIDATE(BIND) is employed in the BIND command, the BIND will fail if a referenced object does not exist. If the default option VALIDATE(RUN) is employed, then the BIND will not fail due to the missing object, it will simply provide a warning message. In IBM Db2 documentation, there is a concept known as an Incremental BIND. An incremental BIND occurs when one uses the default option VALIDATE(RUN). If VALIDATE(RUN) is employed at BIND time, then at runtime Db2 will again validate the existence of the referenced objects before execution. This is called an incremental BIND. If a referenced object does not exist at runtime, the execution will fail.

BIND PACKAGE … EXPLAIN(NO|YES|ONLY) …

The BIND PACKAGE command includes the EXPLAIN option. The value YES tells Db2 to populate a variety of EXPLAIN tables with access path information, when it adds/replaces the new package. The data placed in these tables effectively explains the access path chosen by the optimizer for each SQL statement optimized in the BIND. NO is the default, and tells Db2 to add/replace the designated package, but to not populate these tables with access path information. ONLY tells Db2 to only populate these tables, and to not create/replace a PKG.

While beyond the scope of this paper, it should be noted that output from the EXPLAIN option in the BIND command is placed into several tables documented in IBM Db2 doc, including but not limited to:

o userid.PLAN_TABLE

o userid.DSN_FUNCTION_TABLE

o userid.DSN_STATEMENT_TABLE

o userid.DSN_STATEMENT_CACHE_TABLE

o Many others … see the discussion on DSNTESC below for details.

The SDSNSAMP member named DSNTESC contains DDL which can be copied and used to create EXPLAIN tables (by individual ID). Db2 V11 provides for the creation of twenty-two (22) different EXPLAIN tables. These are the tables populated with access path information when EXPLAIN is employed.

Note that there are several ways to request that an EXPLAIN be performed for one or more SQL statements. One other example is the EXPLAIN SQL statement. In each case, several (or all) of the EXPLAIN tables are populated with access path information for the SQL statement(s) in question. See the IBM Db2 doc for more on EXPLAIN. See the Db2 Administration Guide for an explanation of the data captured in these tables, and how to interpret that data.

BIND PACKAGE … SQLERROR(NOPACKAGE|CONTINUE|CHECK) …

Another BIND PACKAGE option worth noting is the SQLERROR option. NOPACKAGE is the default, and tells Db2 that no PKG is to be created if an SQL error is encountered in any SQL statement being optimized. CONTINUE tells Db2 to create a package even if errors occur when binding SQL statements. If the cause of the error persists at runtime however, the statements in error cannot execute. Any attempt to execute them at runtime results in an error. CHECK tells Db2 to only check the SQL for syntax and semantics, but no PKG is to be created or replaced.

BIND PACKAGE … KEEPDYNAMIC(NO|YES) …

This BIND option dictates if prepared (i.e. optimized) dynamic SQL statements should be cached for reuse in local thread storage at runtime. This should not be confused with the ZPARM CACHEDYN=YES|NO, which dictates if prepared (i.e. optimized) dynamic SQL statements should be cached for reuse in the EDMPool Global Dynamic SQL Statement Cache.

KEEPDYNAMIC(YES) indicates that prepared DYN SQL statements are “cached” in local thread storage past COMMIT. KEEPDYNAMIC(NO) indicates that DYN SQL is not cached in local thread storage past COMMIT. NO is the default. Note that KEEPDYNAMIC(YES) is mutually exclusive with REOPT(ALWAYS). See discussions on REOPT(ALWAYS) and on Cached Dynamic SQL, below.

BIND PACKAGE … QUALIFIER(qualifier-name) …

This Bind option determines the implicit qualifier for unqualified names of various objects (e.g. tables, views, aliases) contained in the package. See the discussion on this option below for more details.

BIND PACKAGE … CURRENTDATA(NO|YES) …

This Bind option enables a feature in Db2 often referred to as Lock Avoidance. Lock Avoidance will allow a thread to avoid taking a lock, and will allow the thread to read through an existing (incompatible) lock to access the data desired. There are specific circumstances which must exist for lock avoidance to take place. To request lock avoidance, this BIND option must be set to NO, and it only applies to statements bound with ISOLATION(CS) … Cursor Stability … for Read-Only (RO) cursors. In such a case, Db2 will perform two tests (one called the PUNC Bit test and one called the CLSN test) to determine if the page/row desired contains an uncommitted update. If either test conclusively determines that the desired page/row does not contain uncommitted updates, then the page/row is retrieved regardless of any incompatible locks on that page/row, and no new lock is placed on the page/row on behalf of this thread. It should be noted that regardless of the setting for CURRENTDATA in the BIND, all singleton SELECT statements bound with ISOLATION(CS) are always eligible for Lock Avoidance. There is a great deal more on this topic in IBM Db2 doc, the details of which are far beyond the scope of this paper.

BIND PACKAGE … DEGREE(1|ANY) …

This BIND option enables query parallelism for static SQL query statements in the package. Parallelism is an action taken by Db2 on behalf of a thread, whereby additional work threads (aka parallel threads) are dynamically spawned to aid in retrieval of requested data. The parallel threads work asynchronously to complete the data retrieval more quickly, thereby reducing the elapsed time to data retrieval and thread completion.

DEGREE(1) tells Db2 that static SQL queries in this BIND are not eligible for parallelism. DEGREE(ANY) tells Db2 that static SQL queries in this BIND are eligible for parallelism. That is not to say that those statements will automatically get parallelism, only that they are made eligible for parallelism if DEGREE(ANY) is specified. Db2 ultimately determines if parallel threads should be spawned for a given workload, and the number of parallel threads that will be spawned. The user has some influence in this Db2 decision, but Db2 ultimately makes all decisions regarding the use of parallel threads.

It should be noted that there also exists a special register in Db2 called CURRENT DEGREE=1|ANY. This special register may be set by the application to enable or disable query parallelism for dynamic SQL queries. The default value for this special register is a ZPARM named CDSSRDEF. This topic, like so many discussed in this paper, is quite complex and beyond the scope of the paper. Refer to IBM Db2 doc for more detail on this topic.

BIND PACKAGE … ISOLATION(CS|RR|RS|UR) …

This Bind option dictates the degree to which the operations of one UOW (Unit-Of-Work) are isolated from the effects of other concurrent update operations. ISOLATION addresses the situation where one UOW wants to read data and another UOW wants to update the same data.

This BIND option dictates the duration that RO (Read Only) page/row locks are held during CURSOR processing. This topic is discussed in detail in another paper in this blog (entitled Db2 for z/OS Locking Overview). It is an extremely complex topic within Db2, and will merely be summarized here. See the paper in this blog, and IBM Db2 doc for more detail on this topic.

Suffice it to say that the ISOLATION level selected at BIND time plays a significant role in determining the duration of held locks, and the subsequent timing in the release of those locks.

The BIND PACKAGE options discussed above are just a few of the options available at bind time. There are many more options included in the BIND PACKAGE command, influencing the BIND and optimizer in a variety of ways. They influence ownership of the package bound, authorization verification for the BIND, SCHEMA assignment for unqualified objects, several locking related options, re-optimization specifications at runtime for SQL containing host variables, optimization hints, specifications for parallelism, package versioning, package stability … the list is very long. See the IBM Db2 Command Reference Guide (and other IBM Db2 doc) for details on all such BIND options.

A topic closely related to, yet different from BIND PACKAGE, is the concept of package rebind. As previously discussed, a package rebind can be used when no SQL has changed in the application, but one wishes the optimizer to reconsider the static SQL access paths previously chosen with the prior BIND. An access path might improve for a variety of reasons when the package is rebound (see notes above for examples). A rebind can be employed (using the same DBRM used for the prior BIND) to recreate the package.

The two primary ways in which Package rebind is performed/invoked are:

1. REBIND command

2. Auto rebind (controlled by the ABIND ZPARM)

The REBIND PACKAGE command simply takes the pre-existing DBRM (the DBRM used for the prior BIND PACKAGE) and re-optimizes the access paths for all static SQL found in the DBRM. Note that this DBRM exists in a PDS (Partitioned Data Set … see prior notes on this). The PDS in which the DBRM resides is noted in the Db2 catalog metadata for the package, in a SYSIBM.SYSPACKAGE column named PDSNAME (this is only true for locally bound packages … for packages bound from a distributed source, this column contains a product signature and some form of remote location specification). As a result, for a locally bound package, the REBIND command will acquire the DBRM from the PDSNAME value (data set name) and the package identifier value in the REBIND command (to identify the member in the PDS), and perform the rebind.

Auto rebind occurs at package execution when a locally bound package has been previously marked “invalid” by Db2, and the ZPARM ABIND is set to YES. There are certain events which result in Db2 marking an existing package as “invalid”. When a package is marked invalid, it is noted in the metadata for the package in a SYSIBM.SYSPACKAGE column named VALID. While there are several events which can drive Db2 to mark a package invalid, two examples follow:

o When one adds a new DATE, TIME, or TIMESTAMP column to a table, packages accessing that table are marked “invalid”.

o When an object (e.g. TABLE, INDEX, etc.) is dropped, packages referencing that object are marked “invalid”.

o There are a variety of other ways in which a package could be marked “invalid”. See IBM Db2 doc for details.

If a package marked “invalid” is executed, Db2 automatically attempts to rebind (aka “auto rebind”) the package, but only if the ZPARM ABIND=YES is set (if ABIND=NO, the execution of the package fails). If auto rebind is attempted and succeeds, the package is executed normally. If auto rebind is attempted and fails, the package is marked “inoperative” via the OPERATIVE column of the catalog table SYSIBM.SYSPACKAGE. In such a case, the only option is to resolve the reason for the rebind failure and attempt it again, or BIND from a new DBRM.

Having discussed the ways in which a package can be created and recreated, let’s now discuss the ways in which a package can be removed/deleted. A package can be deleted via either the FREE PACKAGE command or the DROP PACKAGE SQL statement. While there are technical differences between the two, detailed in IBM Db2 doc, they effectively accomplish the same task. That is to delete an existing package from a Db2 subsystem. For the remainder of this paper, the FREE PACKAGE command will be used in favor of the DROP PACKAGE SQL statement.

FREE PACKAGE is a command which allows for the removal (i.e. deletion) of an existing package. One may choose to free a single package or several packages using this command. The FREE PACKAGE command takes the form of:

FREE PACKAGE (location-id.coll-id.package-id.(version-id)) … more options.

Note that the location-id is always blank and should be omitted in the command. The coll-id is the collection ID of the package and the package-id is the DBRM name of the package, as used in the BIND PACKAGE command which originally bound the package.

The coll-id and package-id values are the only two required values in the command. The coll-id, package-id, and version-id values can be specifically named (i.e. discreet) or one may specify an asterisk (*) as a wildcard character, which tells Db2 to free all packages which qualify based on the other names provided in the command and certain rules documented in Db2 doc.

Let’s now discuss “package versioning”. As previously discussed, a package name takes the form of … LOCATION.COLLID.DBRM-NAME.CONTOKEN. The FREE PACKAGE command however, does not employ the CONTOKEN when identifying the package to be freed (see above). If two packages are named with the identical COLLID and DBRM-NAME, then Db2 must provide a way to distinguish between the two for certain commands (most notably for FREE PACKAGE).

Db2 allows one to specify a version-id during precompile (PC). In the previous discussion on PC, version-id was not mentioned. It is discussed here, and in detail below. The sole purpose of a package version-id is to provide a handle for commands such as FREE PACKAGE (and some others such as the REBIND command). If a version-id is not specified at PC, then it defaults to an “empty string”. If a version-id is specified at PC, it can be used in commands such as FREE PACKAGE to help distinguish among like named packages (i.e. when more than one package exists with the same COLLID.DBRM-NAME).

In the case where more than one package exists with the same COLLID.DBRM-NAME, one may direct the FREE PACKAGE command to free one specific package containing that COLLID.DBRM-NAME by specifying the version-id. If one specifies () in the command for version-id, an empty string is used for the version ID. If one omits the version-id in the command, the default value depends on how one specifies package-id. If one uses ‘*’ for package-id then version-id defaults to ‘*’. If one provides an explicit value for package-id then version-id defaults to an empty string.

In the case where package versioning is used (i.e. a version-id is specified in PC), two packages using two different DBRM’s can be bound with the same COLLID and DBRM-NAME. They will each be assigned a different CONTOKEN at PC, for synchronization at runtime. They will each also be assigned a different version-id at PC, to be used as a handle for commands.

One example where this might take place is when a Db2 tool (product) has two generally available (GA) product versions (e.g. V1 and V2). The DBRM’s for V1 are different from the DBRM’s for V2, but both will be bound with the same COLLID and DBRM-NAME. They are distinguished at runtime by the unique CONTOKEN associated with each package/load module, and they are distinguished in the FREE PACKAGE command by the version-id.

A common mistake made by Db2 users in the interpretation of this topic is that CONTOKEN and VERSION are the same. They are not the same, and they exist for different reasons.

CONTOKEN is stored in the SYSIBM.SYSPACKAGE.CONTOKEN column and is used at runtime to assure that the correct package is executed for the calling load module. They are synchronized by their common CONTOKEN. CONTOKEN means “consistency token”, and assures consistency between the calling load module and the package being executed.

CONTOKEN is assigned at PC using the LEVEL(…) option, as follows:

o If PC LEVEL(…) is used, then that is the CONTOKEN assigned to the resulting DBRM and updated source program produced by the PC.

o If PC LEVEL option is not used, then the CONTOKEN assigned to the resulting DBRM and updated source program is the PC TIMESTAMP.

As previously stated, this CONTOKEN is embedded (by the PC) in both the DBRM and the updated program source deck. The DBRM is bound into a PKG. The updated source deck is compiled and LKED into a load module. At runtime, this CONTOKEN is used to correctly match the load module with its PKG. At runtime, the correct PKG is automatically chosen by Db2 based on the CONTOKEN of the load module and the CONTOKEN of the PKG.

VERSION is stored in the SYSIBM.SYSPACKAGE.VERSION column and is used as a handle for certain commands (most notably in this discussion for FREE PACKAGE) to identify a specific package when there exists more than one package with the same COLLID and DBRM-NAME.

VERSION is assigned at PC using the VERSION(xxxxxx|AUTO) option, as follows:

o If no VERSION option is used at PC, it is by default an empty string ().

o If VERSION(AUTO) is used at PC, the VERSION is assigned the value assigned to the CONTOKEN.

o If VERSION(xxxxxx) is used at PC then xxxxxx is the VERSION assigned.

So, VERSION is not the same as CONTOKEN, although the values will be the same if the PC VERSION(AUTO) option is used.

Having discussed the various components of package names and some of the parameters in a FREE PACKAGE command, let’s look at some FREE PACKAGE examples. Note that in all cases, one must have the appropriate Db2 authorities for each package to be freed, or the package will not be freed. Only those packages for which the command user possesses the Db2 authority to FREE the package will the package be successfully freed.

FREE PACKAGE (COLLID1.DBRM1) … the version-id will default to an empty string and the command will free any package in the collection ID named COLLID1, with a DBRM name of DBRM1, and with an empty string () for a version-id.

FREE PACKAGE (COLLID1.*) … the version-id will default to ‘*’ and the command will free all packages in the collection ID named COLLID1.

FREE PACKAGE (*.DBRM1) … the version-id will default to an empty string and the command will free any package in any coll-id in which the DBRM name is DBRM1, and with an empty string () for a version-id.

FREE PACKAGE (COLLID1.DBRM1.(VERID1) … will free the package in the collection ID COLLID1, with DBRM name DBRM1, and tagged with a version-id (at PC) of VERID1.

FREE PACKAGE (*.*) … will/could free all packages in the Db2 subsystem … NEVER EVER DO THIS!!!

Next, let’s discuss a few (but certainly not all) non-BIND PACKAGE related factors which influence SQL optimization and access path selection. The optimization of SQL statements performed by the Db2 optimizer is an extremely complex process. There are several (many) factors which play a role in the access path selected for each SQL statement being optimized. Below is a sampling of some of the non-BIND command factors, discussed here to merely provide a flavor for how the optimizer works.

When the optimizer selects an access path for an SQL statement, one of the factors on which it relies is called the “filter factor” of the predicate. The filter factor is a number between 0 and 1 (e.g. .5) and represents the percent of rows for which the SQL predicate is (likely) true (e.g. .5 = 50% of the rows qualify for this predicate).

In many cases, Db2 will not be able to determine the exact filter factor for a predicate, and will employ an “estimated filter factor”. In such cases, Db2 will rely on data such as RUNSTATS to help determine an estimated filter factor for the predicate, and thereby select the best access path.

When host variables exist in a predicate, there may be no way at BIND time for Db2 to estimate the filter factor for the predicate, because Db2 will not know the value of the host variable at bind time. In such cases, Db2 will employ a default filter factor. Unfortunately, this default filter factor may not allow for selection of the optimum access path.

When host variables exist in a predicate, the BIND option REOPT can be employed to tell Db2 to re-optimize the SQL statement each time it is executed (at runtime), using the now known value of the host variable. Because the value of the host variable is known at runtime, Db2 can identify a more accurate filter factor for choosing an access path. When REOPT(ALWAYS) is specified, static SQL statements with host variables, parameter markers, and those reliant on special registers are always re-optimized at runtime. This enables the optimizer to know the value of the host variable during optimization, providing a more accurate filter factor.

It should be noted that REOPT(ALWAYS) is but one of several selections allowed in the BIND REOPT option. The entire topic is quite complex, and beyond the scope of this paper. See IBM Db2 doc for more detail option.

Another factor in access path selection is a concept known as the CLUSTERRATIO. This is the percentage of rows in the object which are in true (correct) physical order (as per the clustering index). There is a column in the Db2 catalog table SYSIBM.SYSINDEXES named CLUSTERRATIO. This column is populated every time RUNSTATS are collected for the table in question. Note that there are several ways to collect RUNSTATS, and the collection (and accuracy) of RUNSTATS usually plays a significant role in access path selection. CLUSTERRATIO is one such statistic collected with RUNSTATS. CLUSTERRATIO is usually only important for sequential access, and is not frequently used by the optimizer for random access queries.

Optimization Hints allow one to tell the Db2 optimizer to use an old access path stored in a PLAN_TABLE (stored there as the result of a prior EXPLAIN, discussed above) when selecting an access path. This is the equivalent of telling the optimizer which access path is preferred, using rows in an existing PLAN_TABLE. There is a BIND option called OPTHINT and a ZPARM called OPTHINTS, which both play a role in providing optimization hints.

A comment should be made in this paper regarding a Db2 feature called Access Path Stability (aka APS, aka Plan Stability, there are several names for this feature … this paper will use APS). APS provides the ability to retain old access paths (i.e. old packages) during REBIND PACKAGE (it applies to REBIND only, not to BIND). When a REBIND is executed, the old package is replaced with the new package, and the old package is deleted in favor of the new package. Based on a variety of factors, it can happen that the new package was optimized with less desirable access paths than the access paths contained in the old package. In such a case, it is not possible to revert to the old package (the more desirable access paths) unless the old package is saved (rather than deleted) when the package was replaced.

The REBIND PACKAGE option PLANMGMT gives one the ability to direct Db2 to retain (i.e. save) old access paths (i.e. old packages), in the event a prior access path is determined to be more desirable. In such a case, Db2 provides the user the ability to revert to that saved package. The entire topic pertaining to APS is quite complex, and encompasses a variety of concepts which are beyond the scope of this paper. Simply be aware that Db2 provides for the ability to retain old access paths during REBIND, and the ability to revert to those old access paths if desired. The IBM Db2 doc contains a great deal of information on this topic.

Let’s now discuss a related topic in which a package may be selected for use at runtime via a special register called CURRENT PACKAGESET. The same DBRM (containing the same CONTOKEN) can be bound into packages residing in different COLLID’s. At runtime, the application may set the special register CURRENT PACKAGESET to identify which of the packages should be employed by Db2 for that execution of the package. An example of this follows, and will help to illustrate this feature.

Example: A PC created a DBRM named DBRMA. It has a CONTOKEN which is identical to the CONTOKEN placed into the updated source program by the PC. The updated source program is compiled and LKED, and the load module contains the CONTOKEN. The DBRM named DBRMA is used in two BIND operations, to create a package in two different COLLID’s, named COLLID1 and COLLID2. Here are the BIND PACKAGE commands to do this:

BIND PACKAGE(COLLID1) -

QUALIFIER(OWNER1) -

MEMBER(DBRMA) . . . etc. (more BIND options)

BIND PACKAGE(COLLID2) -

QUALIFIED(OWNER2) -

MEMBER(DBRMA) . . . etc. (more BIND options)

Note that each BIND specifies a different QUALIFIER value. The result of the two BIND commands is two packages located one each in COLLID1 and COLLID2. So, the two packages are named:

LOCATION.COLLID1.DBRMA.CONTOKEN

LOCATION.COLLID2.DBRMA.CONTOKEN

The LOCATION is always blank, and the CONTOKEN value is the same for each package. Both DBRM’s reference a table named TABLEX, which is not qualified with a CREATOR (SCHEMA) value … e.g. … SELECT * FROM TABLEX;

If at runtime, the application wishes to access OWNER1.TABLEX, it issues an SQL statement as follows:

SET CURRENT PACKAGESET = ‘COLLID1’;

The SELECT * FROM TABLEX statement will be directed by Db2 to OWNER1.TABLEX.

If at runtime, the application wishes to access OWNER2.TABLEX, it issues an SQL statement as follows:

SET CURRENT PACKAGESET = ‘COLLID2’;

The SELECT * FROM TABLEX statement will be directed by Db2 to OWNER2.TABLEX.

Depending on the value of CURRENT PACKAGESET, the application can direct the SELECT statement to access either table (or both, by toggling back and forth using the SET CURRENT PACKAGESET statement).

There is also a related special register named CURRENT PACKAGE PATH. It essentially performs the same function as CURRENT PACKAGESET, except that CURRENT PACKAGE PATH can be set to >1 COLLID … e.g. …

SET CURRENT PACKAGE PATH = ‘coll-id1, coll-id2, coll-id3, … ’;

The first COLLID in the path which contains the package is used by Db2 for that execution of the package/statement. CURRENT PACKAGE PATH is always checked before CURRENT PACKAGESET. If CURRENT PACKAGE PATH is blank (i.e. not set), then CURRENT PACKAGESET is used. If CURRENT PACKAGESET is blank, the order of the PLAN PKLIST at BIND PLAN time is used. There is a great deal of doc on the use of these two special registers in Db2 documentation.

Let’s briefly discuss a feature of Db2 in which prepared (optimized and executable) dynamic SQL statements are cached for future reuse. Recall (from the above discussion) that SQL statements can be categorized as either static or dynamic. Static SQL statements are optimized into prepared (executable) statements at BIND time. Dynamic SQL statements are optimized into prepared (executable) statements at runtime. The statically prepared SQL statements are stored in the package, and need not be re-optimized with each successive execution. The dynamically prepared SQL statements are executed and then discarded by Db2, unless Db2 is configured to cache (save) them for future reuse. If they are not cached for future reuse, each execution of a dynamic SQL statement must be re-optimized. For applications employing a high volume of dynamic SQL activity, this can become very expensive and time consuming. It takes time and costs money (in terms of CPU time, etc.) to re-optimize dynamic SQL statements with each execution.

An alternative is Dynamic SQL Statement Caching. Dynamic SQL Statement Caching is a feature within Db2 which allows for optimized (prepared) dynamic SQL statements to be saved for reuse. When such a prepared statement is cached and subsequently re-executed, the cost and time required to re-optimize the same dynamic SQL statement is saved. The cached statement is re-executed instead. Db2 can be configured to cache (save) prepared dynamic SQL statements in two ways. The statements can be cached “globally”, or they can be cached “locally”.

Global Dynamic SQL Caching is enabled at the Db2 subsystem level, for all dynamic SQL executed in that subsystem. The prepared statements are stored in a global cache called the EDM Statement Pool. This pool is storage above the 2GB Bar in the DBM1 address space. The size of the Global Statement Cache is controlled by certain configuration settings, most notably the ZPARM EDMSTMTC. Global Dynamic SQL Caching is enabled with the ZPARM CACHEDYN=YES|NO. When this ZPARM is set to YES, the value specified for the EDMSTMTC ZPARM becomes the minimum number of KB set aside in virtual storage in the DBM1 address space, to be used for the caching of all/most prepared dynamic SQL statements. While there are certain BIND settings (e.g. the REOPT=ALWAYS setting discussed above) which will disable dynamic SQL caching for some statements, any prepared dynamic SQL statement which is not excluded from global caching will be stored in this EDM Statement Cache. The cached statements are then available to Db2 for reuse, without the need for re-optimization. If the cache becomes full, it can be increased to a certain maximum size. If that maximum size is approached, Db2 may expel statements from the cache on a Least Recently Used (LRU) basis, to make room for more frequently used statements.

Local Dynamic SQL Statement Caching provides a way to cache prepared dynamic SQL statements in a more granular manner, by caching them at the package level in local thread storage. Every thread in Db2 is allocated storage in the DBM1 address space. Most of that storage is above the 2GB Bar in DBM1, but some must also reside below the bar. Recall that a Db2 thread executes a package. When the package is bound, the BIND PACKAGE option KEEPDYNAMIC(YES|NO) dictates whether prepared dynamic SQL statements executed from that package are to be stored in local thread storage for that thread, or not. If KEEPDYNAMIC is set to NO (the default), Db2 optimizes each dynamic SQL statement to be executed into a prepared statement. Db2 then executes the prepared statement and discards it when done with it. If that same dynamic SQL statement is re-executed, Db2 must again optimize it into a prepared statement before execution. If KEEPDYNAMIC is set to YES, Db2 will cache (save) each prepared dynamic SQL statement after the first optimization of each statement, and reuse it if the statement is re-executed. This saves the application (thread) the cost of re-optimization of the statement, in exchange for the price of the storage to cache the prepared statement in local thread storage for reuse.

Note that there is nothing to restrict Db2 from caching prepared dynamic SQL statements both globally and locally. The two methods are not mutually exclusive, and one may choose to cache prepared statements in both the global and local caches. While this is not a recommended configuration, and is likely not done frequently, it can be done.

Finally, it should be noted here again that the complex discussions associated with the caching of dynamic SQL are closely related to equally complex discussions on the BIND PACKAGE option REOPT (see above). Both topics interact with and impact each other, and the details are beyond the scope of this discussion. Suffice it to say that these topics are documented in detail in a variety of IBM Db2 doc.

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download

To fulfill the demand for quickly locating and searching documents.

It is intelligent file search solution for home and business.

Literature Lottery

Related searches