[20141217] Record length and block size.txt

--Yesterday I read http://savvinov.com/2014/12/15/4k-bug-is-not-a-bug/,
--which mentions the following (quoted):
A couple of weeks back I received an update on my "4k bug" SR (slow multirow DML performance for small block sizes). As
it turns out, the observed behavior is not a bug. It's actually more interesting than that. It is an undocumented
feature that controls the insert mode depending on the estimated row size (the sum of all column sizes). If the
estimated column size exceeds the size of the block, then Oracle switches to row-by-row processing.

--So when the record length exceeds the block size, Oracle switches the insert to row-by-row processing. Let me test this myself with an 8k block size.

SCOTT@test> @ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.3.0     Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

create table t1(x number, x2 varchar2(4000),x3 varchar2(4000))  SEGMENT CREATION IMMEDIATE;
create table t2(x number, x2 varchar2(1000),x3 varchar2(1000))  SEGMENT CREATION IMMEDIATE;
--SEGMENT CREATION IMMEDIATE eliminates the effect of deferred segment creation.
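
--Before inserting anything, a quick check (my own addition, not in the original
--post) of the "estimated row size" the feature reportedly compares against the
--block size, i.e. the sum of the column sizes:
select table_name, sum(data_length) est_row_length
  from user_tab_columns
 where table_name in ('T1','T2')
 group by table_name;
--For T1 this is 22 + 4000 + 4000 = 8022 bytes; for T2, 22 + 1000 + 1000 = 2022.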

set autotrace traceonly

insert into t1 select level, rpad(' ', 100, ' '),rpad('a',100,'a') from dual connect by level<=100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825

------------------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | INSERT STATEMENT              |      |     1 |     2   (0)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL      | T1   |       |            |          |
|*  2 |   CONNECT BY WITHOUT FILTERING|      |       |            |          |
|   3 |    FAST DUAL                  |      |     1 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(LEVEL<=100000)

Statistics
----------------------------------------------------------
        403  recursive calls
     128692  db block gets
       3553  consistent gets
       3877  physical reads
   46041644  redo size
        840  bytes sent via SQL*Net to client
        852  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
     100000  rows processed

insert into t2 select level, rpad(' ', 100, ' '),rpad('a',100,'a') from dual connect by level<=100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825
------------------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | INSERT STATEMENT              |      |     1 |     2   (0)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL      | T2   |       |            |          |
|*  2 |   CONNECT BY WITHOUT FILTERING|      |       |            |          |
|   3 |    FAST DUAL                  |      |     1 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(LEVEL<=100000)

Statistics
----------------------------------------------------------
        358  recursive calls
      30132  db block gets
       6227  consistent gets
       3094  physical reads
   23675572  redo size
        843  bytes sent via SQL*Net to client
        852  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
     100000  rows processed

--Note the redo size: 46041644 for t1 versus 23675572 for t2.
--46041644/23675572 = 1.94, so the wide column definition roughly doubled the redo.
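
--To isolate redo without autotrace, a minimal sketch (my addition; assumes
--SELECT privileges on v$mystat and v$statname) that snapshots the session's
--'redo size' statistic around the insert:
column redo_size new_value redo_before
select ms.value redo_size
  from v$mystat ms, v$statname sn
 where ms.statistic# = sn.statistic# and sn.name = 'redo size';

insert into t1 select level, rpad(' ',100,' '), rpad('a',100,'a')
  from dual connect by level<=100000;

select ms.value - &redo_before redo_generated
  from v$mystat ms, v$statname sn
 where ms.statistic# = sn.statistic# and sn.name = 'redo size';
rollback;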

exec sys.dbms_stats.gather_table_stats ( OwnName => 'SCOTT',TabName => 't1',Estimate_Percent => NULL,Method_Opt => 'FOR ALL COLUMNS SIZE 1 ',Cascade => True ,No_Invalidate => false)
exec sys.dbms_stats.gather_table_stats ( OwnName => 'SCOTT',TabName => 't2',Estimate_Percent => NULL,Method_Opt => 'FOR ALL COLUMNS SIZE 1 ',Cascade => True ,No_Invalidate => false)

SCOTT@test> set autotrace  off
SCOTT@test> select table_name,blocks,avg_row_len from dba_tables where owner=user and table_name in ('T1','T2');
TABLE_NAME     BLOCKS AVG_ROW_LEN
---------- ---------- -----------
T2               3016         207
T1               3016         207
--Both tables occupy the same number of blocks.
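
--A cross-check at the segment level (my addition; user_segments is visible to
--the table owner) to see whether the allocated space matches as well:
select segment_name, blocks, bytes
  from user_segments
 where segment_name in ('T1','T2');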

--Now try a smaller table definition.
create table t3(x number, x2 varchar2(100),x3 varchar2(100))  SEGMENT CREATION IMMEDIATE;
set autotrace traceonly
insert into t3 select level, rpad(' ', 100, ' '),rpad('a',100,'a') from dual connect by level<=100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825
------------------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | INSERT STATEMENT              |      |     1 |     2   (0)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL      | T3   |       |            |          |
|*  2 |   CONNECT BY WITHOUT FILTERING|      |       |            |          |
|   3 |    FAST DUAL                  |      |     1 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(LEVEL<=100000)

Statistics
----------------------------------------------------------
        421  recursive calls
      30203  db block gets
       6270  consistent gets
       3083  physical reads
   23683316  redo size
        843  bytes sent via SQL*Net to client
        852  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
     100000  rows processed
--Essentially the same statistics as the insert into t2.


--Quoting more from the original author:

The query below can identify the tables that will have slow bulk DML because of the high maximum row length:

select c.owner,
       c.table_name,
       sum(data_length) est_row_length,
       ts.block_size
from dba_tab_columns c,
     (select owner, table_name, tablespace_name
      from dba_tables
      union select table_owner owner, table_name, tablespace_name
      from dba_tab_partitions
     ) t,
     dba_tablespaces ts
where c.owner = t.owner
and c.table_name = t.table_name
and ts.tablespace_name = t.tablespace_name
group by ts.block_size, c.owner, c.table_name
having sum(data_length)>=ts.block_size
order by 1, 2;

--Note that his script filters on sum(data_length) >= ts.block_size, so it should not return T1. A variant with a safety margin is sketched below.
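
--Based on the test above, the real cutoff appears to sit somewhat below the
--full block size: T1's estimated row length is only 8022 bytes, yet it hit the
--slow path on 8192-byte blocks, presumably because block overhead is deducted.
--A hedged variant of his script with an arbitrary 200-byte margin (my guess,
--not a documented value):
select c.owner, c.table_name, sum(data_length) est_row_length, ts.block_size
from dba_tab_columns c,
     (select owner, table_name, tablespace_name from dba_tables
      union
      select table_owner owner, table_name, tablespace_name from dba_tab_partitions
     ) t,
     dba_tablespaces ts
where c.owner = t.owner
and c.table_name = t.table_name
and ts.tablespace_name = t.tablespace_name
group by ts.block_size, c.owner, c.table_name
having sum(data_length) >= ts.block_size - 200
order by 1, 2;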

For the tables returned by this query, bulk DML will be internally performed row-by-row, decreasing performance and
increasing redo generation. If you want to change this, you'll need to either shrink column size, or migrate the table
to a tablespace with a larger block size.
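
--A hedged sketch of the first suggested fix, shrinking the column size: the
--test data only holds 100 characters per column, so T1's columns can be
--narrowed in place (Oracle allows decreasing a varchar2 length when all
--existing values fit):
alter table t1 modify (x2 varchar2(1000), x3 varchar2(1000));
--After this, T1's estimated row length drops from 8022 to 2022 bytes, well
--under the 8k block, so bulk inserts should take the fast path again.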

--This argues against the lazy table design of declaring every varchar2 as 4000
--regardless of need; choosing lengths that match the business is the right
--approach. Of course, this only matters for bulk operations.
--What exactly is row-by-row processing? I'll dig into the documentation and
--write another blog post; a rough sketch of the distinction follows.
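
--As a rough preview (my own sketch, not from the docs): "row-by-row" is what a
--plain PL/SQL loop does, one INSERT execution per row, while bulk (array)
--processing binds many rows per execution, e.g. via FORALL:
declare
    type t_tab is table of t3%rowtype;
    l_rows t_tab := t_tab();
begin
    -- row-by-row: the INSERT statement executes once per row
    for i in 1 .. 1000 loop
        insert into t3 values (i, rpad(' ', 100, ' '), rpad('a', 100, 'a'));
    end loop;
    rollback;

    -- bulk: fill an array, then one INSERT execution binds all 1000 rows
    for i in 1 .. 1000 loop
        l_rows.extend;
        l_rows(i).x  := i;
        l_rows(i).x2 := rpad(' ', 100, ' ');
        l_rows(i).x3 := rpad('a', 100, 'a');
    end loop;
    forall i in 1 .. l_rows.count
        insert into t3 values l_rows(i);
    rollback;
end;
/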