Monday, December 8, 2014

Selecting one row



If the table is small do I really need an index to select just one row?

Yes you do.  Here is a simple test you can use yourself to prove this to anyone one who asks.

If I have a small table with say 100 rows that all fit in a single block, it seems reasonable that a full table scan to find the one row is just fine.  After all it’s just one block so what does it matter?  Here is a simple table and some code to fill it with just 100 rows, and they all fit in one 8K block.

CREATE TABLE ONEHUNDREDROWS (ID NUMBER, FILLER VARCHAR2(2))
/

BEGIN
  FOR X IN 1..100 LOOP
      INSERT INTO ONEHUNDREDROWS VALUES (X,'AA');
  END LOOP;
END;
/
Once the table is created you can run this to see that all the rows are in one block, this returns the distinct block numbers in all the ROWIDs for the table:
SQL> SELECT distinct dbms_rowid.rowid_block_number(ROWID, 'BIGFILE') BLOCK from onehundredrows;

          BLOCK
---------------
       26209107
Now here is a block of code to select one row from this table, 5000 times.  The block will time how much elapsed and CPU time are consumed when doing this.   The select used after running the block will show use the resources used.  Of particular interest to us is the BUFFERs column, this shows how many LIOs were done.  Make sure STATISTICS_LEVEL is set to all. 
ALTER SESSION SET STATISTICS_LEVEL=ALL
/

create or replace procedure one_row_test is
   l_start_time pls_integer;
   l_start_cpu  pls_integer;
   y varchar2(2);
begin
   l_start_time := DBMS_UTILITY.GET_TIME;
   l_start_cpu  := DBMS_UTILITY.GET_CPU_TIME;
   for i in 1 .. 5000 loop
      select filler into y from onehundredrows where id = 42;
   end loop;
   DBMS_OUTPUT.put_line ('******* Select one row 5000 times *******');
   DBMS_OUTPUT.put_line ('Times in hundredths of a second');
   DBMS_OUTPUT.put_line ('**** TIME   - '||to_char(DBMS_UTILITY.get_time - l_start_time));
   DBMS_OUTPUT.put_line ('**** CPU    - '||to_char(DBMS_UTILITY.GET_CPU_TIME - l_start_cpu));
end;
/

EXEC ONE_ROW_TEST

SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('9mxb9qk546vu6',NULL,'ALLSTATS LAST'))
/

Here is a run doing a full table scan:

SQL> EXEC ONE_ROW_TEST
******* Select one row 5000 times *******
Times in hundredths of a second
**** TIME   - 26
**** CPU    - 25
SQL>
SQL>
SQL> SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('9mxb9qk546vu6',NULL,'ALLSTATS LAST'))
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------
SQL_ID  9mxb9qk546vu6, child number 0
-------------------------------------
SELECT FILLER FROM ONEHUNDREDROWS WHERE ID = 42

Plan hash value: 35615928

----------------------------------------------------------------------------------------------
| Id  | Operation         | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                |      1 |        |      1 |00:00:00.01 |       7 |
|*  1 |  TABLE ACCESS FULL| ONEHUNDREDROWS |      1 |      1 |      1 |00:00:00.01 |       7 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"=42)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

Notice that the buffers is 7 for the full scan even though all 100 rows fit in one block, the scan has to go all the way to the high water mark of the table.  

Now let’s run the same thing but this time there will be an index on the ID column:


SQL> create index one_id on onehundredrows(id)
  2  /
SQL>
SQL>
SQL> EXEC ONE_ROW_TEST
******* Select one row 5000 times *******
Times in hundredths of a second
**** TIME   - 22
**** CPU    - 22
SQL>
SQL>
SQL> SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('9mxb9qk546vu6',NULL,'ALLSTATS LAST'))
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------
SQL_ID  9mxb9qk546vu6, child number 0
-------------------------------------
SELECT FILLER FROM ONEHUNDREDROWS WHERE ID = 42

Plan hash value: 3262481358

--------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |      1 |        |      1 |00:00:00.01 |       2 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ONEHUNDREDROWS |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  2 |   INDEX RANGE SCAN          | ONE_ID         |      1 |      1 |      1 |00:00:00.01 |       1 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=42)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


The LIOs for this one is 2.  You may say that what is the big deal from 7 to 2?  True enough on a one run bases, this isn’t much to worry about.  But imagine doing this millions of times.  Now that extra 5 LIOs start to add up.

Also notice that in just 5000 look ups the time goes from 22 to 26 centiseconds, about an 18% increase.  Again doesn’t seem like much but it adds up.  Also what happens if this table does grow over time?  The full scan will continue to look at more and more blocks taking more and more time and resources.  The index scan will likely stay about the same for a long time before it starts to change.  The table could easily be about 10 times the size and the index would still be about the same for LIOs and time.

If the index on ID is unique that the time drops a bit more down to 20 centiseconds, so yes it’s even better to have a unique index just to select a single row from a small table.

SQL> drop index one_id
  2  /

SQL> create unique index one_id on onehundredrows(id)
  2  /

SQL> EXEC ONE_ROW_TEST
******* Select one row 5000 times *******
Times in hundredths of a second
**** TIME   - 20
**** CPU    - 20
SQL> SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('9mxb9qk546vu6',NULL,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID  9mxb9qk546vu6, child number 0
-------------------------------------
SELECT FILLER FROM ONEHUNDREDROWS WHERE ID = 42

Plan hash value: 3462298723

--------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |      1 |        |      1 |00:00:00.01 |       2 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ONEHUNDREDROWS |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  2 |   INDEX UNIQUE SCAN         | ONE_ID         |      1 |      1 |      1 |00:00:00.01 |       1 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=42)

No comments:

Post a Comment