Wednesday, 11 August 2010

Histogram

A histogram is a frequency distribution (metadata) that describes the distribution of data values within a table.

 Query optimizer

The Oracle Query Optimizer uses histograms to predict better query plans. The ANALYZE command or DBMS_STATS package can be used to compute these histograms.

Column Statistics and Histograms

When gathering statistics on a table, DBMS_STATS gathers information about the data distribution of the columns within the table.
The most basic information about the data distribution is the maximum value and minimum value of the column.
However, this level of statistics may be insufficient for the optimizer's needs if the data within the column is skewed.
For skewed data distributions, histograms can also be created as part of the column statistics to describe the data distribution of a given column. Histograms are described in more details in "Viewing Histograms".
Histograms are specified using the METHOD_OPT argument of the DBMS_STATS gathering procedures. Oracle Corporation recommends setting the METHOD_OPT to FOR ALL COLUMNS SIZE AUTO. With this setting, Oracle automatically determines which columns require histograms and the number of buckets (size) of each histogram. You can also manually specify which columns should have histograms and the size of each histogram.


Viewing Histograms

Column statistics may be stored as histograms. These histograms provide accurate estimates of the distribution of column data. Histograms provide improved selectivity estimates in the presence of data skew, resulting in optimal execution plans with nonuniform data distributions.

No comments:

Post a Comment