Oracle Secure Enterprise Search - Implementing search ...



Implementing Enterprise Manager Monitoring with SES - a case study of search.

An Oracle Technical White Paper

November 2006

[pic]

EXECUTIVE SUMMARY 3

Overview of Secure Enterprise Search Statistics 4

Using EM to display and provide SES Statistics 6

Crawler status monitoring 6

Average Query Time 7

EM Alerts and Notifications for SES Events 9

EM Report Configuration for SES statistics 9

Overview 9

Ad hoc EM Reports for search. 9

Scheduled EM Reports for search. 11

Appendix A – SQL statements for EM UDMs 12

Average Query Time 12

95th percentile query time 12

Maximum query time 12

Number of searches performed 12

Number of searches resulting in no hits 12

Percentile at which target query time is reached 12

Display Crawler Status 12

Top 50 most popular searches by number of hits for previous week 12

Top 50 most popular searches by number of reqs for previous week 12

Appendix B – SQL statements for EM Reports 13

SQL for ad hoc report – 95th percentile and average query time 13

SQL for Scheduled EM Report – 1 day, 7 day and 30 day data 13

EXECUTIVE SUMMARY

Oracle Secure Enterprise Search 10g 10.1.6 (SES) is an out-of-the-box solution that provides search capabilities across multiple repositories - Oracle databases and portals, websites, email systems, files on disk and many more.

Oracle SES was configured to provide search services for (WWW) and otn. (OTN). This service can be accessed via the URL where Oracle’s content such as documentation, blogs and forums can be searched. To read about how SES for search. was implemented, go to

Once search. was implemented, Oracle Enterprise Manager, Grid Control (EM) was used to create performance and availability monitoring and reporting. The objectives of this exercise included:

• To create real-time monitoring and alert notification methods

• To provide consistency and to enable one central monitoring application to be used for all systems and applications managed within the group

• To provide a long-term history of the performance of the site, its response time and its overall effectiveness

• Expand the population of users who can view SES Statistic without needing SES Admin privileges

This document aims to provide the audience with the basic knowledge required to create and configure EM user-defined metrics, notification rules and reports to enhance and complement the built-in utilities that are supplied with the Secure Enterprise Search appliance. This document assumes that the reader has already installed and configured the EM agent on the SES server, and if the EM console is on a separate server (Grid Control), then that EM server name/IP address has been added to the list of tcp.invited_nodes in the sqlnet.ora file on the SES server (see Section 5.3 in the SES Installation whitepaper for more details).

Any database object name such as tables, columns etc are subject to change and the SQL statements in this document are not supported or guaranteed to work in future versions of the product.

Overview of Secure Enterprise Search Statistics

Every time a user performs a search, an entry is made by the application into the EQ_TEST.EQ$STATISTIC table. The entry includes details of keywords entered, number of hits and the length of time that the application took to return the results to the user

.

The amount of time that data is kept in the EQ$STATISTIC table is determined by the setting of “Query Logging Period (Days)” in the SES Admin UI tool. The parameter can be found by navigating to the “Global Settings”, “Query Configuration” page, in the “Query Statistics” section. The administrator is also able to disable statistics collection in this section if desired.

[pic]

A number of database jobs are created as part of the SES installation. These jobs perform various calculations of the data held in the EQ$STATISTIC table and use the results to populate a further set of statistic tables. These results tables are accessed via the Admin UI when the Administrator displays the “General” page or any of the “Statistics” sub-pages.

[pic]

This method of generating the statistics means that access to the results via the Admin UI is fast as the data is being retrieved directly from one table with no further manipulation being necessary. The downside can be that the data shown is not real-time. The age of the data is dependant on when the relevant database job was last run to populate the results tables. The frequency of the jobs can be modified by following the instructions in metalink Note #363923.1

Details of how the SES jobs are configured can be viewed in the EQSYS.EQ$_JOB_INFO table using a query such as:

SELECT

JI_ID, JI_TYPE, JI_JOB_NO, JI_INTERVAL,

to_char(JI_PROCESS_START,'MM/DD HH24:MI:SS') as Start_Date

from EQSYS.EQ$_JOB_INFO;

JI_ID JI_TYPE JI_JOB_NO JI_INTERVAL START_DATE

---------- ------------ ---------------- --------------------- --------------

1 1 8 HH0001

2 4 9 DD0101

3 3 10 DD0104 10/10 04:00:07

4 6 11 DD0101 10/10 01:00:02

5 7 12 ONDEMAND

6 8 13 ONDEMAND 09/19 01:47:56

7 9 14 DD0103 10/10 03:00:02

The JI_JOB_NO value corresponds to the JOB value in DBA_JOBS where similar information can also be viewed. The JI_ID value corresponds to the second parameter used in the “eq_job.invoke(1,n)” statement in the DBA_JOBS WHAT entry, and is also required as the first parameter in the “eq_job.change_job” procedure as mentioned in Note #363923.1.

JI_ID and JI_JOB_NO can be different across different SES installations but JI_TYPE always remains consistent and refers to the type of statistics calculations and results that are generated.

For example:

JI_TYPE 3 is the index optimization job. JI_TYPE 9 is a job that calculates most

of the general summary statistics and Query Statistics. JI_TYPE 4 generates the results shown on the “Daily Statistics” page.

Using EM to display and provide SES Statistics

User-defined Metrics (UDMs) can be created and leveraged in EM against the SES database target, to enhance and complement the built-in performance reporting, monitoring and notifications of EM and SES. The following sections provide details of the UDMs that have been created against the search. back-end databases. The SQL statements are listed in Appendix A.

Crawler status monitoring

Although this information is available in the SES Admin UI, it was felt useful to also be able to retrieve it directly in EM. One advantage is that SES Admin privileges are not required, only EM access. This UDM enables the EM user to see, at a glance, what Crawls are configured, and what their status is. The thresholds have been set so that a warning email alert is generated if a crawl is manually stopped by an SES Administrator. A critical email alert is generated if a crawl fails.

The UDM is configured as a “Two Column” String metric – the SQL query is listed in Appendix A.

[pic]

The output from this UDM is displayed in this format:

[pic]

Average Query Time

This UDM provides an average query time based on all queries executed in the previous 10 minutes. It is a single-value number.

[pic]

As it is a single-value metric, the historical values can be viewed in graph form:

[pic]

Other single-value UDMs

▪ 95th percentile query time

▪ Maximum query time

▪ number of searches performed

▪ number of searches resulting in no hits

▪ percentile at which target query time is obtained

Other two-column UDMs

▪ top 50 most popular searches by number of hits for previous week

▪ top 50 most popular searches by number of requests for previous week

[pic]

EM Alerts and Notifications for SES Events

Thresholds can be used to trigger alerts/notifications, if for example, the average query time becomes very slow. The following steps need to be performed to implement the notification:

▪ Critical and/or warning thresholds must be set on the UDM and be meaningful

▪ An appropriate notification method must be created/defined in advance

▪ Create a notification rule using Target Type “Database Instance” and ensure the SES database is included in the target list

▪ Metric name/type in the Notification Rule needs to include one or more of the “User-Defined xxx Metric” options

▪ An appropriate notification method should be assigned to the rule

EM Report Configuration for SES statistics

Overview

Each time a user-defined metric runs, the results are inserted into the EM repository. EM reports cannot run SQL statements against tables outside of the EM repository database so instead, we have to configure the reports to utilize the data created by the UDM.

The main views accessed are MGMT$METRIC_CURRENT which holds the latest results from the UDMs; MGMT$METRIC_HOURLY which is the data from the MGMT$METRIC_CURRENT table grouped into hourly periods and with the minimum, maximum and average values calculated for each hourly period; and MGMT$METRIC_DAILY which is similar to the MGMT$METRIC_HOURLY view but grouped into 1 day periods, instead of 1 hour.

Ad hoc EM Reports for search.

On-Demand Report to see average and 95th percentile query times for a customizable time period, broken down by day

Steps followed to create the report:

▪ In the Report page in EM and click on the “Create” link.

▪ In the General section, enter a name for the report. Select the “Use the Specified Target” option and enter the name of the SES database.

▪ Set an initial Time Period, and select the “Allow the report viewer to customize the time period” option

[pic]

▪ In the Elements section, add a “Table from SQL” type element and enter a SQL statement in the parameter settings, based on example in Appendix B.

▪ Click on “Preview” to validate the returned data

[pic]

Once the EM end-user clicks on the report name, the report is generated using the initial time period set in the report definition. The user can then click on the “Set Time Period” button to change the output as required.

Scheduled EM Reports for search.

A report can be created and scheduled to run on a regular basis. Copies of the report can optionally be emailed and/or saved for a limited period (determined by the settings in the “Schedule” section).

[pic]

The SQL used to generate a report that shows average and maximum performance figures (based on UDMs) for the previous 1 day, 7 days and 30 days, for the specified SES databases, can be found in Appendix B

The output when viewed through EM can be ordered by any column, by clicking on the column header:

[pic]

Appendix A – SQL statements for EM UDMs

(The time period covered in the following sql statements can be adjusted by modifying the ‘where qdate’ clause – eg ‘qdate>=sysdate-60/1440’ will retrieve results for the previous hour. The examples below all cover a 10-minute period)

Single Value Metrics

Average Query Time

select trunc(avg(qtime),4) from eq_test.eq$statistic where qdate>=sysdate-10/1440

95th percentile query time

select percentile_disc(0.95) within group (order by qtime asc) as perc_disc from eq_test.eq$statistic where qdate>=sysdate-10/1440

Maximum query time

select trunc(max(qtime),4) from eq_test.eq$statistic where qdate>=sysdate-10/1440

Number of searches performed

select count(qtime) from eq_test.eq$statistic where qdate>=sysdate-10/1440

Number of searches resulting in no hits

select count(seq) from eq_test.eq$statistic where qdate>=sysdate-10/1440 and numhits=0

Percentile at which target query time is reached

(target time in this example is 2 seconds)

select trunc((percent_rank(2) within group (order by qtime asc) )*100,3) as percentile_rank from eq_test.eq$statistic where qdate>=sysdate-10/1440

2 column metrics

Display Crawler Status

select cs_name, cs_stage from eqsys.eq$_crawler_sched

Top 50 most popular searches by number of hits for previous week

select /*+ NO_MERGE(v) */ * from (select user_query,sum(numhits) from eq_test.eq$sum_stat_popular where sum_date > to_number(to_char(sysdate-7,'YYYYMMDDHH24')) group by user_query order by sum(numhits) desc ) v where rownum < 51

Top 50 most popular searches by number of reqs for previous week

select /*+ NO_MERGE(v) */ * from (select user_query,sum(cnt) from eq_test.eq$sum_stat_popular where sum_date > to_number(to_char(sysdate-7,'YYYYMMDDHH24')) group by user_query order by sum(cnt) desc ) v where rownum < 51

Appendix B – SQL statements for EM Reports

SQL for ad hoc report – 95th percentile and average query time

select met.rollup_timestamp as Day,

met.target_name as Database,

decode(met.column_label,'SES - Current 95th Percentile time','95th Percentile query time',

'SES - Current Avg Query Time','Average Query Time','') as UDM_Name,

met.average,met.minimum,met.maximum

from mgmt$metric_daily met, mgmt$target tgt

where met.target_guid = tgt.target_guid

and met.column_label in

('SES - Current 95th Percentile time','SES - Current Avg Query Time')

and met.rollup_timestamp >

mgmt_view_util.adjust_tz(??EMIP_BIND_START_DATE??, ??EMIP_BIND_TIMEZONE_REGION??, tgt.timezone_region)

and met.rollup_timestamp < mgmt_view_util.adjust_tz(??EMIP_BIND_END_DATE??, ??EMIP_BIND_TIMEZONE_REGION??, tgt.timezone_region)

and met.TARGET_GUID = ??EMIP_BIND_TARGET_GUID??

where the met.column_label values are the names of the UDMs whose data you wish to use. ??VAR_NAME?? signifies EM-specific bind-variables corresponding to values entered by the user (or defaulted) via the UI at run-time.

SQL for Scheduled EM Report – 1 day, 7 day and 30 day data

select

decode(a.column_label,'SES - Current 95th Percentile time','95th Percentile query time',

'SES - Current Avg Query Time','Average Query Time',

'SES - Current Number of Failed Searches','Number of Failed Searches',

'SES - Current Number of Queries','Number of Searches',

'SES - Current Max Query Time','Maximum Query Time',

'SES - Current Percentile for Target Query time','Percentile for Target Query Time','') as Metric,

a.TARGET_NAME as Database,

trunc(avg(a.average),2) as Avg_1day,

trunc(avg(b.average),2) as Avg_7day,

trunc(avg(c.average),2) as Avg_30day,

trunc(avg(a.MAXIMUM),2) Max_1day,

trunc(avg(b.MAXIMUM),2) Max_7day,

trunc(avg(c.MAXIMUM),2) Max_30day

from MGMT$METRIC_DAILY a, MGMT$METRIC_DAILY b, MGMT$METRIC_DAILY c

where

a.target_name in

('oses_server1.',

'ses_server2.'

)

and

b.target_name in

('oses_server1.',

'ses_server2.'

)

and

c.target_name in

('oses_server1.',

'ses_server2.'

)

and

a.column_label in

('SES - Current Avg Query Time',

'SES - Current Max Query Time',

'SES - Current Number of Queries',

'SES - Current 95th Percentile time',

'SES - Current Percentile for Target Query time')

and

b.column_label in

('SES - Current Avg Query Time',

'SES - Current Max Query Time',

'SES - Current Number of Queries',

'SES - Current 95th Percentile time',

'SES - Current Percentile for Target Query time')

and

c.column_label in

('SES - Current Avg Query Time',

'SES - Current Max Query Time',

'SES - Current Number of Queries',

'SES - Current 95th Percentile time',

'SES - Current Percentile for Target Query time')

and a.ROLLUP_TIMESTAMP >=sysdate-2

and b.ROLLUP_TIMESTAMP >=sysdate-8

and c.ROLLUP_TIMESTAMP >=sysdate-31

and c.target_name=a.target_name

and b.target_name=a.target_name

and c.column_label=a.column_label

and b.column_label=a.column_label

group by a.target_name,a.column_label

[pic]

Oracle Secure Enterprise Search

November 2006

Author: Julie Wilson

Oracle Corporation

World Headquarters

500 Oracle Parkway

Redwood Shores, CA 94065

U.S.A.

Worldwide Inquiries:

Phone: +1.650.506.7000

Fax: +1.650.506.7200



Oracle Corporation provides the software

that powers the Internet.

Oracle is a registered trademark of Oracle Corporation. Various

product and service names referenced herein may be trademarks

of Oracle Corporation. All other product and service names

mentioned may be trademarks of their respective owners.

Copyright © 2006 Oracle Corporation

All rights reserved.

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download