Extended Statistics

Real data often shows correlations between column values. Because the optimizer does not know these real-world relationships, it will have difficulty estimating the cardinality and may pick sub-optimal execution plans for your queries. For example, it may join tables in an inefficient order or ignore the most efficient index etc.

Let’s create a very simple table to demonstrate how optimizer miscalculates the cardinality when there are correlated columns in WHERE clause:


We’ll gather statistics of our table:

We gathered stats and the optimizer will show the correct cardinality for queries with single predicates. “E-Rows” means estimated rows (cardinality):

In next query, we will use two predicates. Logically one of them is ineffective but optimizer will think that both predicates will filter the rows:

As you see, the optimizer assumes each predicates will filter the rows so it expects only one returns from full table scan. In Oracle 11g, we can create extended statistics, so the optimizer can include the correlation factor between the columns when estimating the cardinality.

Here’s the function to create extended statistics:

DBMS_STATS.CREATE_EXTENDED_STATS(
ownname VARCHAR2,
tabname VARCHAR2,
extension VARCHAR2)
RETURN VARCHAR2;

Extension can be either a column group or an expression. Suppose the specified table has two columns c1, c2. An example column group can be “(c1, c2)” and an example expression can be “(c1 + c2)”.

This function creates a column statistics entry in the system for a user specified column group or an expression in a table. Statistics for this extension will be gathered when user or auto statistics gathering job gathers statistics for the table.

Now the optimizer knows that there’s a correlation between these columns, and makes a better estimation for the cardinality.

Here are some notes about extended statistics:

1) An extension cannot contain a virtual column.
2) You can not create extended statistics on the tables owned by SYS.
3) You can not create extended statistics on cluster tables, index organized tables, temporary tables, external tables.
4) COMPATIBLE parameter needs to be 11.0.0.0.0 or greater.
5) Extended statistics only works with equality predicates and in-lists.

It’s possible to drop the extended statistics:

PROCEDURE DROP_EXTENDED_STATS(
ownname VARCHAR2,
tabname VARCHAR2,
extension VARCHAR2);

Please share
  •  
  •  
  •  
  •  
  •  
  •  

AWS Big Data Specialist. Oracle Certified Professional (OCP) for EBS R12, Oracle 10g and 11g. Co-author of "Expert Oracle Enterprise Manager 12c" book published by Apress. Awarded as Oracle ACE (in 2011) and Oracle ACE Director (in 2016) for the continuous contributions to the Oracle users community. Founding member, and vice president of Turkish Oracle User Group (TROUG). Presented at various international conferences including Oracle Open World.

Leave Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.