Thursday, 12 August 2010

Bitmap Index and BTree Index

http://www.akadia.com/services/ora_bitmapped_index.html

Bitmap Indexes:

Overview
Oracle's two major index types are Bitmap indexes and B-Tree indexes. B-Tree indexes are the regular type that OLTP systems make much use of, and bitmap indexes are a highly compressed index type that tends to be used primarily for data warehouses.
Characteristic of Bitmap Indexes
>>For columns with very few unique values (low cardinality)
Columns that have low cardinality are good candidates (if the cardinality of a column is <= 0.1 %  that the column is ideal candidate, consider also 0.2% – 1%)
>>Tables that have no or little insert/update are good candidates (static data in warehouse)

Advantage of Bitmap Indexes
The advantages of them are that they have a highly compressed structure, making them fast to read and their structure makes it possible for the system to combine multiple indexes together for fast access to the underlying table.
Compressed indexes, like bitmap indexes, represent a trade-off between CPU usage and disk space usage.
Disadvantage of Bitmap Indexes
The reason for confining bitmap indexes to data warehouses is that the overhead on maintaining them is enormous. A modification to a bitmap index requires a great deal more work on behalf of the system than a modification to a b-tree index. In addition, the concurrency for modifications on bitmap indexes is dreadful.


The real benefit of bitmapped indexing occurs when one table includes multiple bitmapped indexes. Each individual column may have low cardinality. The creation of multiple bitmapped indexes provides a very powerful method for rapidly answering difficult SQL queries.

For example, assume there is a motor vehicle database with numerous low-cardinality columns such as car_color, car_make, car_model, and car_year. Each column contains less than 100 distinct values by themselves, and a b-tree index would be fairly useless in a database of 20 million vehicles. However, combining these indexes together in a query can provide blistering response times a lot faster than the traditional method of reading each one of the 20 million rows in the base table. For example, assume we wanted to find old blue Toyota Corollas manufactured in 1981:

select
   license_plat_nbr
from
   vehicle
where
   color = ‘blue’
and
   make = ‘ toyota
and
   year = 1981;

Oracle uses a specialized optimizer method called a bitmapped index merge to service this query. In a bitmapped index merge, each Row-ID, or RID, list is built independently by using the bitmaps, and a special merge routine is used in order to compare the RID lists and find the intersecting values. As the number if distinct values increases, the size of the bitmap increases exponentially, such that an index with 100 values may perform thousands of times faster than a bitmap index on 1,000 distinct column values.  Also, remember that bitmap indexes are only suitable for static tables and materialized views which are updated at nigh and rebuilt after batch row loading.  If your tables are not read-only during query time, DO NOT consider using bitmap indexes! 1 - 7 distinct key values - Queries against bitmap indexes with a low cardinality are very fast.
8-100 distinct key values - As the number if distinct values increases, performance decreases proportionally.
100 - 10,000 distinct key values - Over 100 distinct values, the bitmap indexes become huge and SQL performance drops off rapidly.
Over 10,000 distinct key values - At this point, performance is ten times slower than an index with only 100 distinct values.
 Oracle Bitmap indexes are a very powerful Oracle feature, but they can be tricky!
You will want a bitmap index when:
1 - Table column is low cardinality - As a ROUGH guide, consider a bitmap for any index with less than 100 distinct values

    select region, count(*) from sales group by region;
2 - The table has LOW DML - You must have low insert./update/delete activity.  Updating bitmapped indexes take a lot of resources, and bitmapped indexes are best for largely read-only tables and tables that are batch updated nightly.
3 - Multiple columns - Your SQL queries reference multiple, low cardinality values in there where clause.  Oracle cost-based SQL optimizer (CBO) will scream when you have bitmap indexes on . 
Troubleshooting Oracle bitmap indexes:
Some of the most common problems when implementing bitmap indexes include:
1. Small table - The CBO may force a full-table scan if your table is small!
2. Bad stats - Make sure you always analyze the bitmap with dbms_stats right after creation:
CREATE BITMAP INDEX
emp_bitmap_idx
ON index_demo (gender);

exec dbms_stats.gather_index_stats(OWNNAME=>'SCOTT', INDNAME=>'EMP_BITMAP_IDX');

  3. Test with a hint - To force the use of your new bitmap index, just use a Oracle INDEX hint:
select /*+ index(emp emp_bitmap_idx) */
   count(*)
from
   emp, dept
where
   emp.deptno = dept.deptno;http://www.dba-oracle.com/t_garmany_easysql_btree_index.htm
B-Tree Index
By default, the Oracle creates a b_tree index.  In a b-tree, you walk the branches until you get to the node that has the data you want to use.  In the classic b-tree structure, there are branches from the top that lead to leaf nodes that contain the data.  If I wanted to find the rowid for the number 28 in the b-tree defined in Figure 5.3, I would start at the top or header block.
Is my number greater or less that 50?  Well 28 is less than 50, so I move to the branch marked 25.  Is 28 greater or less that 25?  Since 28 is greater than 25, I move to the leaf node marked 26-49.  I scan this node for the rowid of the number 28.  The key to the b-tree in Figure 5.3 is that I can find any number from one to 100 by reading no more than three nodes. 
The Oracle database implements the b-tree index in a little different manner.  An Oracle b-tree starts with only two nodes, one header and one leaf.  The header contains a pointer to the leaf block and the values stored in the leaf block.  As the index grows leaf bocks are added to the index (Figure 5.4).
To find a specific row, we look at the header to find the range of values in each leaf and then go directly to the leaf node that contains the value we are looking for.  In the index in Figure 5.4, any row can be found by reading two nodes.  Since the header contains only pointers to leaf blocks, a single header node can support a very large number (hundreds) of leaf nodes. 
If the header block fills, then a new header block is established, and the former header node becomes a branch node.  This is called a three level b-tree (Figure 5.5).
In Figure 5.5, you can find any value in any leaf node by reading no more than three blocks.  I can also create a multicolumn index, also called a concatenated or complex index.
SQL> create index sales_keys
  2  on sales (book_key, store_key, order_number);
Index created.
Here, we created an index called sales_keys on three columns of the sales table.  A multicolumn index can be used by the database but only from the first or lead column.  Our sales_keys index can be used in the following query.
select
  order_number,
  quantity
from
  sales
where
   book_key = 'B103';
Note that the lead column of the index is the book_key, so the database can use the index in the query above.  I can also use the sales_keys index in the queries below.
select
  order_number,
  quantity
from
  sales
where
   book_key = 'B103'
and
   store_key = 'S105'
and
   order_number = 'O168';
However, the database cannot use that index in the following query because the WHERE clause does not contain the index lead column.
select
  order_number,
  quantity
from
  sales
where
   store_key = 'S105'
and
   order_number = 'O168';
Also, note that in the query below, the database can answer the query from the index and so will not access the table at all.
 select
  order_number
from
  sales
where
   store_key = 'S105'
and
   book_key = 'B108';
As you can see, b-tree indexes are very powerful.  You must remember that a multicolumn index cannot skip over columns, so the lead index column must be in the WHERE clause filters.



DBA: Performance and Availability
Bitmap Index vs. B-tree Index: Which and When?
by Vivek Sharma

Understanding the proper application of each index can have a big impact on performance.
Conventional wisdom holds that bitmap indexes are most appropriate for columns having low distinct values—such as GENDER, MARITAL_STATUS, and RELATION. This assumption is not completely accurate, however. In reality, a bitmap index is always advisable for systems in which data is not frequently updated by many concurrent systems. In fact, as I'll demonstrate here, a bitmap index on a column with 100-percent unique values (a column candidate for primary key) is as efficient as a B-tree index.
In this article I'll provide some examples, along with optimizer decisions, that are common for both types of indexes on a low-cardinality column as well as a high-cardinality one. These examples will help DBAs understand that the usage of bitmap indexes is not in fact cardinality dependent but rather application dependent.

Comparing the Indexes

There are several disadvantages to using a bitmap index on a unique column—one being the need for sufficient space (and Oracle does not recommend it). However, the size of the bitmap index depends on the cardinality of the column on which it is created as well as the data distribution. Consequently, a bitmap index on the GENDER column will be smaller than a B-tree index on the same column. In contrast, a bitmap index on EMPNO (a candidate for primary key) will be much larger than a B-tree index on this column. But because fewer users access decision-support systems (DSS) systems than would access transaction-processing (OLTP) ones, resources are not a problem for these applications.

No comments:

Post a Comment