How to Interpret an Explain Plan


What is an explain plan?

An explain plan is a representation of the access path that is taken when a query is executed within Oracle. Query processing can be divided into 7 phases:

  1. Syntactic – checks the syntax of the query
  2. Semantic – checks that all objects exist and are accessible
  3. View Merging – rewrites query as join on base tables as opposed to using views
  4. Statement Transformation – rewrites query transforming some complex constructs into simpler ones where appropriate (e.g. subquery unnesting, in/or transformation). Some transformations use rules while others are costed based upon statistics.
  5. Optimization – determines the optimal access path for the query to take. The Cost Based Optimizer (CBO) uses statistics to analyze the relative costs of accessing objects.
  6. Query Evaluation Plan(QEP) Generation
  7. QEP Execution

Steps [1]-[6] are sometimes grouped under the term ‘Parsing
Step  [7] is the Execution of the statement.

The explain plan is a representation of the access path produced in step 6.

Once the access path has been decided upon, it is stored in the library cache together with the statement itself. Queries are stored in the library cache based upon a hashed representation of the query. When looking for a statement in the library cache, we first apply a hashing algorithm to the current statement and then look for this hash value in the library cache. This access path will be used until the query is re-parsed.

Terminology

Row Source – A row source is a software function that implements specific operations (such as a table scan or a hash join) and returns a set of rows.
Predicates – The where clause of a query
Tuples – rows
Driving Table – This is the row source that we use to seed the query. If this returns a lot of rows then this can have a negative effect on all subsequent operations
Probed Table  – This is the object we look-up data in after we have retrieved relevant key data from the driving table.

How does Oracle access data?

At the physical level Oracle reads blocks of data.The smallest amount of data read is a single Oracle block, the largest is constrained by operating system limits (and multi-block i/o). Logically Oracle finds the data to read by using the following methods:

Full Table Scan (FTS)
Index Look-up (unique & non-unique)
Index Full Scan
Index Fast Full Scan
Table Access by Rowid

If you consider the following simple explain plan:


SQL> set autotrace traceonly explain

SQL> select * from emp;
Execution Plan
----------------------------------------------------------
Plan hash value: 2872589290

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    14 |   532 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| EMP  |    14 |   532 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

The steps in the explain plan are indented to indicate the hierarchy of operations and which steps are dependent on which other steps. When looking at an indented plan, to find which operation is executed first, examine the Operation column.

| Operation         |
---------------------
| SELECT STATEMENT  |
|  TABLE ACCESS FULL|

In this column, the rightmost (ie most indented) uppermost operation is the first thing that is executed. In other words, look down the Operation column starting from the top until you find the operation that is indented the most. This is executed first. In this example this would be the operation with Id=1 (TABLE ACCESS FULL).

In the example, TABLE ACCESS FULL EMP is the first operation that will occur. This statement means we are doing a full table scan of table EMP. When this operation completes then the resultant row source is passed up to the next level of the query for processing. In this case it is the SELECT STATEMENT which is the top of the query.

The other columns in the explain plan provide various pieces of useful information in determining why the plan was chosen:

  • Rows – It tells us the estimated number of rows that the optimizer expects this line of the execution plan to return
  • Bytes It tells us the estimated number of bytes that the optimizer expects this line of the execution plan will return
  • Cost (%CPU) – It’s the optimizer’s estimation of the ‘cost’ and %CPU of the query. The cost allows the optimizer to compare the estimated performance of different plans with each other.
  • Time It’s the optimizer’s estimation of the duration of each step of the query.

Example 1


set autotrace traceonly explain

select ename,dname from emp, dept where emp.deptno=dept.deptno and dept.dname in ('ACCOUNTING','RESEARCH','SALES','OPERATIONS');

Execution Plan
----------------------------------------------------------
Plan hash value: 2865896559
----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |    14 |   308 |     5   (0)| 00:00:01 |
|   1 |  MERGE JOIN                  |         |    14 |   308 |     5   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     4 |    52 |     2   (0)| 00:00:01 |
|   3 |    INDEX FULL SCAN           | PK_DEPT |     4 |       |     1   (0)| 00:00:01 |
|*  4 |   SORT JOIN                  |         |    14 |   126 |     3   (0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL         | EMP     |    14 |   126 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("DEPT"."DNAME"='ACCOUNTING' OR "DEPT"."DNAME"='OPERATIONS' OR

"DEPT"."DNAME"='RESEARCH' OR "DEPT"."DNAME"='SALES')

4 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")

filter("EMP"."DEPTNO"="DEPT"."DEPTNO")

What follows is a walk-through of the plan above:

Execution starts with: ID=3 and it gets there as follows:
Starting with ID=0:


------------------------------------------------
| Id  | Operation                    | Name    |
------------------------------------------------
|   0 | SELECT STATEMENT             |         |
|   1 |  MERGE JOIN                  |         |
|*  2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |
|   3 |    INDEX FULL SCAN           | PK_DEPT |
|*  4 |   SORT JOIN                  |         |
|   5 |    TABLE ACCESS FULL         | EMP     |
------------------------------------------------

ID=0 has no operation above it, so it has no parent but it has 1 child.
ID=0 is the parent of ID=1 and is dependent upon it for rows.You can tell it is the parent because the child is indented.
So ID=1 must be executed prior to ID=0

Moving on to ID=1:


|   0 | SELECT STATEMENT             |         |
|   1 |  MERGE JOIN                  |         |
|*  2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |

|*  4 |   SORT JOIN                  |         |

As before, ID=1 is the child of ID=0.
From the indentation, ID=2 and ID=4 are indented at the same level beneath ID=1. Thus ID=1 is the parent of ID=2 and ID=4 and is dependent upon them for rows. So ID=2 and ID=4 must be executed prior to ID=1

Moving on to ID=2:


|   1 |  MERGE JOIN                  |         |
|*  2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |
|   3 |    INDEX FULL SCAN           | PK_DEPT |

ID=2 is the first child of ID=1.
From the indentation, ID=2 is the parent of ID=3 and is dependent upon it for rows. So ID=3 must be executed prior to ID=2. Moving on to ID=3:


|   1 |  MERGE JOIN                  |         |
|*  2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |
|   3 |    INDEX FULL SCAN           | PK_DEPT |

ID=3 is the (only) child of ID=2.
ID=3 has no child operations. This means that ID=3 is the first step that is executed by the query. Rows are provided to ID=2 from this step.

ID=1 and ID=0 are also dependent on ID=3. Once ID=3 has produced rows, they are passed to ID=2 and that step uses them for whatever operation it is performing. It then provides the processed rows to its parent and so on up the tree. This means that ID=2 is the second step that is executed. ID=1 is not executed next because it has 2 inputs. It needs both of these to be accessed before it can start to operate so now lets look at the second child of ID=1, ID=4:


|   1 |  MERGE JOIN                  |         |
|*  4 |   SORT JOIN                  |         |
|   5 |    TABLE ACCESS FULL         | EMP     |

ID=4 is the second child of ID=1.
ID=4 is the parent of ID=5 and is dependent upon it for rows. ID=5 must be executed prior to ID=4. This means that ID=5 is the third step that is executed followed by ID=4.


|   1 |  MERGE JOIN                  |         |

|*  4 |   SORT JOIN                  |         |
|   5 |    TABLE ACCESS FULL         | EMP     |

Once ID=1 has inputs from both of its children it can execute. ID=1 processes the rows it receives from its dependent steps (ID=2 & ID=4) and returns them to its parent ID=0.
ID=0 returns the rows to the user.

A shortened summary of this is:-


------------------------------------------------
| Id  | Operation                    | Name    |
------------------------------------------------
|   0 | SELECT STATEMENT             |         |
|   1 |  MERGE JOIN                  |         |
|*  2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |
|   3 |    INDEX FULL SCAN           | PK_DEPT |
|*  4 |   SORT JOIN                  |         |
|   5 |    TABLE ACCESS FULL         | EMP     |
------------------------------------------------

  • Start with ID=0: SELECT STATEMENT but this is dependent on its child objects
  • So it looks at its first child step:  ID=1  MERGE JOIN but this is dependent on its child objects
  • So it looks at its first child step:  ID=2   TABLE ACCESS BY INDEX ROWID DEPT but this is dependent on its child object
  • So it looks at its only child step:  ID=3   INDEX FULL SCAN PK_DEPT. This has no children so this is executed.
  • Rows from ID=3 are fed back to ID=2
  • Rows from ID=2 are fed back to ID=1 but this has 2 children so the other child ID=4 needs to be explored
  • So it looks at its second child step:  ID=4   SORT JOIN but this is dependent on its child object
  • So it looks at its only child step:  ID=5   TABLE ACCESS FULL EMP This has no children so this is executed.
  • Rows from ID=5 are fed back to ID=4
  • Rows from ID=4 are fed back to ID=1. Since both children have now supplied rows, ID=1 can be executed.
  • Rows from ID=1 are fed back to ID=0
  • Rows from ID=0 are fed back to the client
  • Rows are returned to the parent step(s) until finished