Sunday, 22 August 2010

Managing Optimizer Statistics

Optimizer statistics include the following:
·         Table statistics
o        Number of rows
o        Number of blocks
o        Average row length
·         Column statistics
o        Number of distinct values (NDV) in column
o        Number of nulls in column
o        Data distribution (histogram)
·         Index statistics
o        Number of leaf blocks
o        Levels
o        Clustering factor
·         System statistics
o        I/O performance and utilization
o        CPU performance and utilization

These statistics should not be confused with performance statistics visible through V$ views.
The optimizer statistics are stored in the data dictionary. They can be viewed using data dictionary views.
Because the objects in a database can be constantly changing, statistics must be regularly updated so that they accurately describe these database objects. Statistics are maintained automatically by Oracle or you can maintain the optimizer statistics manually using the DBMS_STATS package.
The DBMS_STATS package also provides procedures for managing statistics. You can save and restore copies of statistics. You can export statistics from one system and import those statistics into another system. For example, you could export statistics from a production system to a test system. In addition, you can lock statistics to prevent those statistics from changing.

Automatic Statistics Gathering

The recommended approach to gathering statistics is to allow Oracle to automatically gather the statistics. Oracle gathers statistics on all database objects automatically and maintains those statistics in a regularly-scheduled maintenance job. Automated statistics collection eliminates many of the manual tasks associated with managing the query optimizer, and significantly reduces the chances of getting poor execution plans because of missing or stale statistics.

14.2.1 GATHER_STATS_JOB

Optimizer statistics are automatically gathered with the job GATHER_STATS_JOB. This job gathers statistics on all objects in the database which have:
·         Missing statistics
·         Stale statistics
This job is created automatically at database creation time and is managed by the Scheduler. The Scheduler runs this job when the maintenance window is opened. By default, the maintenance window opens every night from 10 P.M. to 6 A.M. and all day on weekends.
The GATHER_STATS_JOB job gathers optimizer statistics by calling the DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC procedure. The GATHER_DATABASE_STATS_JOB_PROC procedure collects statistics on database objects when the object has no previously gathered statistics or the existing statistics are stale because the underlying object has been modified significantly (more than 10% of the rows).The DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC is an internal procedure, but its operates in a very similar fashion to the DBMS_STATS.GATHER_DATABASE_STATS procedure using the GATHER AUTO option. The primary difference is that the DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC procedure prioritizes the database objects that require statistics, so that those objects which most need updated statistics are processed first. This ensures that the most-needed statistics are gathered before the maintenance window closes.

Enabling Automatic Statistics Gathering

Automatic statistics gathering is enabled by default when a database is created, or when a database is upgraded from an earlier database release. You can verify that the job exists by viewing the DBA_SCHEDULER_JOBS view:
SELECT * FROM DBA_SCHEDULER_JOBS WHERE JOB_NAME = 'GATHER_STATS_JOB';
In situations when you want to disable automatic statistics gathering, the most direct approach is to disable the GATHER_STATS_JOB as follows:
BEGIN
  DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB');
END;
/
If this feature is disabled, then the automatic statistics gathering job is not able to detect stale statistics.
This feature is enabled when the STATISTICS_LEVEL parameter is set to TYPICAL or ALL. TYPICAL is the default value.

When to Use Manual Statistics

Automatic statistics gathering should be sufficient for most database objects which are being modified at a moderate speed. However, there are cases where automatic statistics gathering may not be adequate. Because the automatic statistics gathering runs during an overnight batch window, the statistics on tables which are significantly modified during the day may become stale. There are typically two types of such objects:
·         Volatile tables that are being deleted or truncated and rebuilt during the course of the day.
·         Objects which are the target of large bulk loads which add 10% or more to the object's total size.
For highly volatile tables, there are two approaches:
·         The statistics on these tables can be set to NULL. When Oracle encounters a table with no statistics, Oracle dynamically gathers the necessary statistics as part of query optimization. This dynamic sampling feature is controlled by the OPTIMIZER_DYNAMIC_SAMPLING parameter, and this parameter should be set to a value of 2 or higher. The default value is 2. The statistics can set to NULL by deleting and then locking the statistics:
     BEGIN
       DBMS_STATS.DELETE_TABLE_STATS('OE','ORDERS');
       DBMS_STATS.LOCK_TABLE_STATS('OE','ORDERS');
   END;
   /
·         The statistics on these tables can be set to values that represent the typical state of the table. You should gather statistics on the table when the tables has a representative number of rows, and then lock the statistics.
This is more effective than the GATHER_STATS_JOB, because any statistics generated on the table during the overnight batch window may not be the most appropriate statistics for the daytime workload.

Restoring Previous Versions of Statistics

Whenever statistics in dictionary are modified, old versions of statistics are saved automatically for future restoring. Statistics can be restored using RESTORE procedures of DBMS_STATS package.

Locking Statistics

In some cases, you may want to prevent any new statistics from being gathered on a table or schema by the DBMS_STATS_JOB process, such as highly volatile tables. In those cases, the DBMS_STATS package provides procedures for locking the statistics for a table or schema.

Manual Statistics Gathering

If you choose not to use automatic statistics gathering, then you need to manually collect statistics in all schemas, including system schemas. If the data in your database changes regularly, you also need to gather statistics regularly to ensure that the statistics accurately represent characteristics of your database objects.

 Gathering Statistics with DBMS_STATS Procedures

Statistics are gathered using the DBMS_STATS package. This PL/SQL package is also used to modify, view, export, import, and delete statistics.
The DBMS_STATS package can gather statistics on table and indexes, and well as individual columns and partitions of tables. It does not gather cluster statistics; however, you can use DBMS_STATS to gather statistics on the individual tables instead of the whole cluster.
When you generate statistics for a table, column, or index, if the data dictionary already contains statistics for the object, then Oracle updates the existing statistics. The older statistics are saved and can be restored later if necessary.
Statistics Gathering Procedures in the DBMS_STATS Package
Procedure :              Collects
GATHER_INDEX_STATS       Index statistics
GATHER_TABLE_STATS       Table, column, and index statistics
GATHER_SCHEMA_STATS      Statistics for all objects in a schema
GATHER_DICTIONARY_STATS  Statistics for all dictionary objects
GATHER_DATABASE_STATS    Statistics for all objects in a database

When to Gather Statistics

When gathering statistics manually, you not only need to determine how to gather statistics, but also when and how often to gather new statistics.
For an application in which tables are being incrementally modified, you may only need to gather new statistics every week or every month. The simplest way to gather statistics in these environment is to use a script or job scheduling tool to regularly run the GATHER_SCHEMA_STATS and GATHER_DATABASE_STATS procedures. The frequency of collection intervals should balance the task of providing accurate statistics for the optimizer against the processing overhead incurred by the statistics collection process.
For tables which are being substantially modified in batch operations, such as with bulk loads, statistics should be gathered on those tables as part of the batch operation. The DBMS_STATS procedure should be called as soon as the load operation completes.
For partitioned tables, there are often cases in which only a single partition is modified. In those cases, statistics can be gathered only on those partitions rather than gathering statistics for the entire table. However, gathering global statistics for the partitioned table may still be necessary.

Managing Statistics

http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/stats.htm#i41497

This section discusses:
·         Setting Statistics
·         Handling Missing Statistics

No comments:

Post a Comment