且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

[20130809]12c Clustering Factor(2).txt

更新时间:2022-09-06 23:27:46

[20130809]12c Clustering Factor(2).txt

参考以下链接,自己重复测试:

http://richardfoote.wordpress.com/2013/05/14/clustering-factor-calculation-improvement-part-ii-blocks-on-blocks/


1.测试环境:
SQL> @ver

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0

2.测试1:
SQL> create table major_tom (id number, DOB date, text varchar2(30));
Table created.

SQL> insert into major_tom select rownum,  sysdate-trunc(dbms_random.value(0, 20000)), 'DAVID BOWIE' from dual connect by level
2000000 rows created.

SQL> commit;
Commit complete.

SQL> create index major_tom_dob_i on major_tom(dob);
Index created.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'MAJOR_TOM', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1');
PL/SQL procedure successfully completed.

SQL>SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i WHERE
t.table_name = i.table_name AND i.index_name='MAJOR_TOM_DOB_I';

TABLE_NAME INDEX_NAME                         BLOCKS   NUM_ROWS CLUSTERING_FACTOR
---------- ------------------------------ ---------- ---------- -----------------
MAJOR_TOM  MAJOR_TOM_DOB_I                      9077    2000000           1988246

--可以发现插入的日期非常随机,CF很大。接近NUM_ROWS。

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'MAJOR_TOM', pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);
PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'MAJOR_TOM_DOB_I', estimate_percent=> null);
PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i WHERE
t.table_name = i.table_name AND i.index_name='MAJOR_TOM_DOB_I';

TABLE_NAME INDEX_NAME                         BLOCKS   NUM_ROWS CLUSTERING_FACTOR
---------- ------------------------------ ---------- ---------- -----------------
MAJOR_TOM  MAJOR_TOM_DOB_I                      9077    2000000           1941549

--可以发现CF几乎没有变化,255块被cache,相比9077块仅仅占用很少的一部分。
--看来这个参数对CF的增强非常有限,特别是一些大表,几乎没有作用。

3.测试2:
SQL> create table ziggy (id number, code number, text varchar2(30));
Table created.

SQL> insert into ziggy select rownum,  trunc(dbms_random.value(0, 100)), 'DAVID BOWIE' from dual connect by level
2000000 rows created.

SQL> commit;
Commit complete.

SQL> create index ziggy_code_i on ziggy(code);
Index created.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'ZIGGY', estimate_percent=> null, cascade=> true,method_opt=>'FOR ALL COLUMNS SIZE 1');
PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i WHERE
t.table_name = i.table_name AND i.index_name='ZIGGY_CODE_I';

TABLE_NAME INDEX_NAME                         BLOCKS   NUM_ROWS CLUSTERING_FACTOR
---------- ------------------------------ ---------- ---------- -----------------
ZIGGY      ZIGGY_CODE_I                         7048    2000000            662624

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'ZIGGY', pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);
PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'ZIGGY_CODE_I', estimate_percent=>null);
PL/SQL procedure successfully completed.

SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
FROM user_tables t, user_indexes i
WHERE t.table_name = i.table_name AND i.index_name='ZIGGY_CODE_I';

TABLE_NAME INDEX_NAME                         BLOCKS   NUM_ROWS CLUSTERING_FACTOR
---------- ------------------------------ ---------- ---------- -----------------
ZIGGY      ZIGGY_CODE_I                         7048    2000000            662624

--可以发现几乎没有变化,按照作者的解析:

    The table has some 7048 blocks but each distinct CODE value has some 20,000 occurrences on average. Therefore, each
value is going to be found 20000/7048 = roughly 2 to 3 times per block. As the index is in CODE order and for each CODE
in rowid order, the CF is going to increment for each CODE value for each distinct block we visit. We will therefore
only go back to a previously visited table block (except for the 2 to 3 visits to the current block) when the CODE value
changes but this will take us all the way back to the first block which is always going to be some 7047 blocks away from
the current one. As 7047 is much greater than the 255 the CF calculation will only cater for, the CF is going to remain
unchanged from the default calculation as a result.

-- code 的变化仅仅从0-99,100种变化,每个值出现20,000次,这样1个值在一个块中出现 20000/7048 ,约2-3次,这样cache=255,对于
-- CF的计算影响很小。

--总之,这个参数在许多情况下,改进很小。