You might face this question if you are going though job interviews. and these are some cases where the Optimizer thinks going for a full table scan is better than index scan.

1:- The selectivity of the records is very high:

We have this table and range of contents:

    SQL> desc HR.ORDERS;
     Name		    Null?    Type
     ---------------------  --------- ---------------------------------------
     ORDER_ID 	 	    NOT NULL  NUMBER(38)
     CUSTOMER			      VARCHAR2(50)
     ITEM			      VARCHAR2(50)
     EVENT_DATE			      DATE
    
    SQL> 

    SQL> select CUSTOMER, count(*) from ORDERS group by CUSTOMER order by 1,2;
    
    CUSTOMER			      COUNT(*)
    --------------------------------  ----------
    JOHN				      10
    PEDRO				      10
    PEPE				      10
    RUBEN				   10000
    TOM					      10
    
    SQL> 

We have an index and it is valid:

    SQL> CREATE INDEX ORDERS_IDX 
    ON ORDERS(CUSTOMER);  
    Index created.
    SQL> 
    
    
     col INDEX_NAME format a20
    SELECT 
        index_name, 
        index_type, 
        visibility, 
        status 
    FROM 
        all_indexes
    WHERE 
        table_name = 'ORDERS';
    
    INDEX_NAME	     INDEX_TYPE 		 VISIBILIT STATUS
    -------------------- --------------------------- --------- --------
    ORDERS_IDX	     NORMAL			 VISIBLE   VALID

Lets get the plan for a lower count:

EXPLAIN PLAN FOR
select * from ORDERS where CUSTOMER = 'TOM';

SELECT 
    PLAN_TABLE_OUTPUT 
FROM 
    TABLE(DBMS_XPLAN.DISPLAY());



PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------
Plan hash value: 3046898238

--------------------------------------------------------------------------------------------------
| Id  | Operation			    | Name	 | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		    |		 |    10 |   760 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| ORDERS	 |    10 |   760 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN		    | ORDERS_IDX |    10 |	 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CUSTOMER"='TOM')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

18 rows selected.

SQL> 

So far, so good, query is using an index.

Lets try now with the higher count:

EXPLAIN PLAN FOR
select * from ORDERS where CUSTOMER = 'RUBEN';

SELECT 
    PLAN_TABLE_OUTPUT 
FROM 
    TABLE(DBMS_XPLAN.DISPLAY());


    PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------
Plan hash value: 1275100350

----------------------------------------------------------------------------
| Id  | Operation	  | Name   | Rows  | Bytes | Cost (%CPU)| Time	   |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	   | 10000 |   742K|	17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| ORDERS | 10000 |   742K|	17   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CUSTOMER"='RUBEN')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

17 rows selected.

SQL> 

In this case the OPTIMIZER thinks it is better to do a full table scan. Lets see the COST of forcing to use the index:

EXPLAIN PLAN FOR
select /*+ INDEX(ORDERS ORDERS_IDX) */ * from ORDERS where CUSTOMER = 'RUBEN';

SELECT 
    PLAN_TABLE_OUTPUT 
FROM 
    TABLE(DBMS_XPLAN.DISPLAY());

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3046898238

--------------------------------------------------------------------------------------------------
| Id  | Operation			    | Name	 | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		    |		 | 10000 |   742K|    84   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| ORDERS	 | 10000 |   742K|    84   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN		    | ORDERS_IDX | 10000 |	 |    29   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CUSTOMER"='RUBEN')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

18 rows selected.

SQL> 

As we can see, the cost of using the index is way higher than doing a full table scan.

Last modified: 8 November 2021

Author

Comments

Write a Reply or Comment

Your email address will not be published.