Testing Index Leaf Block Splits under MSSM and ASSM

A test of index leaf block splits in Oracle. Author: 刘相兵 (Liu Xiangbing) ([email protected])

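Abstract: MSSM cannot prevent the timeouts caused by index splits. Coalescing the index is the most practical solution currently available and has no side effects.

With version 10.2.0.4 and without the relevant one-off patch applied, index split tests were run on tablespaces under both ASSM and MSSM. The conclusions are as follows:

In 10gR2, MSSM cannot prevent the transaction timeouts caused by index splits;

The one-off patch for 10.2.0.4 currently exists only for Linux. We could request the patch and test it specifically (since we do not have the patch, its effect is unknown).

Coalescing the index (alter index ... coalesce) is currently the most feasible solution.

The problem still exists in the latest 11gR2, as confirmed by testing.

The detailed test procedure follows: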

1. Index block splits under automatic segment space management (ASSM)

SQL> drop tablespace idx1 including contents and datafiles;

Tablespace dropped.

SQL> create tablespace idx1 datafile '?/dbs/idx1.dbf' size 500M
     segment space management AUTO
     extent management local uniform size 10M; -- create a tablespace with automatic segment space management

Tablespace created.

SQL> create table idx1(a number) tablespace idx1;

Table created.

SQL> create index idx1_idx on idx1 (a) tablespace idx1 pctfree 0; -- create the test table and index

Index created.

www.oracledatabase12g.com, page 1, July 8, 2011

SQL> insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000; -- insert 250,000 rows

250000 rows created.

SQL> commit;

Commit complete.

SQL> create table idx2 tablespace idx1 as select * from idx1 where 1=2;

Table created.

SQL> insert into idx2
     select * from idx1 where rowid in
     (select rid from
     (select rid, rownum rn from
     (select rowid rid from idx1 where a between 10127 and 243625 order by a) -- pick out part of the later rows: one row in every 250
     )
     where mod(rn, 250) = 0)
     /

933 rows created.

SQL> commit;

Commit complete.
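The row counts in this step can be checked by hand: the range 10127 to 243625 is inclusive, and mod(rn, 250) = 0 keeps every 250th row. A minimal sketch of that arithmetic, run against dual (it does not touch the test tables):

```sql
-- Inclusive range size, then the number of complete groups of 250 rows.
select 243625 - 10127 + 1                 as rows_in_range,  -- 233499
       trunc((243625 - 10127 + 1) / 250)  as sampled_rows    -- 933
  from dual;
```

The two values agree with the "933 rows created" message for idx2 and with the 233499 rows reported by the later delete over the same range.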

SQL> analyze index idx1_idx validate structure; -- analyze the original index

Index analyzed.

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      1280        499           0 -- 499 leaf blocks before any deletion

SQL> delete from idx1 where a between 10127 and 243625; -- mass delete

233499 rows deleted.

SQL> commit;

Commit complete.

SQL> analyze index idx1_idx validate structure;

Index analyzed.

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      1280        499      233499 -- the number of leaf blocks is unchanged after the delete

SQL> insert into idx1 select * from idx2; -- make the empty blocks non-empty again; each block holds only one or two rows, so it is still 75-100% free

933 rows created.

SQL> commit;

Commit complete.

SQL> insert into idx1 select 250000+rownum from all_objects where rownum <= 126; -- set up the precondition for a leaf block split

SQL> select ss.value,sy.name from v$sesstat ss ,v$sysstat sy where ss.statistic#=sy.statistic# and name like '%split%' and sid=(select distinct sid from v$mystat);

     VALUE NAME
---------- ----------------------------------------------------------------
       997 leaf node splits
       997 leaf node 90-10 splits
         0 branch node splits
         0 queue splits -- leaf block splits in the current session so far

SQL> insert into idx1 values (251000); -- this insert does cause a leaf block split

1 row created.

SQL> commit;

Commit complete.

SQL> select ss.value,sy.name from v$sesstat ss ,v$sysstat sy where ss.statistic#=sy.statistic# and name like '%split%' and sid=(select distinct sid from v$mystat);

     VALUE NAME
---------- ----------------------------------------------------------------
       998 leaf node splits
       998 leaf node 90-10 splits
         0 branch node splits
         0 queue splits -- one more leaf block split than in the previous query
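The same per-session counters can also be read from v$mystat joined to v$statname, which avoids the subquery on sid. This is a sketch of an equivalent query, not the one used in the test; run it before and after the insert under test and compare the values.

```sql
-- Split-related statistics for the current session only.
select sn.name, ms.value
  from v$mystat ms
  join v$statname sn
    on sn.statistic# = ms.statistic#
 where sn.name like '%split%'
 order by sn.name;
```

A difference of 1 in "leaf node splits" between the two runs means the insert in between triggered exactly one leaf block split.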

SQL> set linesize 200 pagesize 1500;
SQL> select executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql
     where sql_text like '%insert%idx1%' and sql_text not like '%v$sql%';

EXECUTIONS BUFFER_GETS DISK_READS   CPU_TIME ELAPSED_TIME ROWS_PROCESSED
---------- ----------- ---------- ---------- ------------ --------------
SQL_TEXT
--------------------------------------------------------------------------------
         1        1603          0     271601       271601            933
insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )

         1         156          0      82803        82803            126
insert into idx1 select 250000+rownum from all_objects where rownum <= 126

         1         177          0       3728         3728              1
insert into idx1 values (251000) -- the split read the blocks that are not actually empty, hence the higher buffer_gets

         1        1409          0      40293        40293            933
insert into idx1 select * from idx2

         1      240842          0    3478341      3478341         250000
insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000

SQL> insert into idx1 values (251001); -- an insert that does not split

1 row created.

SQL> commit;

Commit complete.

SQL> select executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql
     where sql_text like '%insert%idx1%' and sql_text not like '%v$sql%';

EXECUTIONS BUFFER_GETS DISK_READS   CPU_TIME ELAPSED_TIME ROWS_PROCESSED
---------- ----------- ---------- ---------- ------------ --------------
SQL_TEXT
--------------------------------------------------------------------------------
         1         156          0      82803        82803            126
insert into idx1 select 250000+rownum from all_objects where rownum <= 126

         1           9          0       1640         1640              1
insert into idx1 values (251001) -- an insert without a split needs only a few buffer_gets

         1         177          0       3728         3728              1
insert into idx1 values (251000)

         1      240842          0    3478341      3478341         250000
insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000

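As Demo 1 shows, under ASSM a mass delete followed by inserts leaves many blocks 75%-100% free but not completely empty. A later leaf block split then makes the inserting foreground process scan a large number of these "empty" blocks. If those blocks are not in memory (which causes physical reads), or if they need delayed block cleanout, the scan slows down and so does the split. Other insert operations end up blocked behind the split and wait on the enq: TX - index contention event.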
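While a slow split is in progress, the blocked sessions should be visible in v$session. The query below is a minimal sketch for spotting them, assuming a 10g or later instance where v$session exposes the EVENT and BLOCKING_SESSION columns; it is not part of the original test.

```sql
-- Sessions currently waiting behind an index block split,
-- together with the session that is holding them up.
select sid,
       event,
       blocking_session,
       seconds_in_wait
  from v$session
 where event = 'enq: TX - index contention';
```

The blocking_session value points at the session performing the split, whose own wait event (for example db file sequential read) shows what is slowing the split down.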

2. Index block splits under manual segment space management (MSSM)

SQL> drop tablespace idx1 including contents and datafiles;

Tablespace dropped.

SQL> create tablespace idx1 datafile '?/dbs/idx1.dbf' size 500M
     segment space management MANUAL -- the MSSM case
     extent management local uniform size 10M;

Tablespace created.

SQL> create table idx1(a number) tablespace idx1;

Table created.

SQL> create index idx1_idx on idx1 (a) tablespace idx1 pctfree 0;

Index created.

SQL> insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000;

250000 rows created.

SQL> commit;

SQL> create table idx2 tablespace idx1 as select * from idx1 where 1=2;

Table created.

SQL> insert into idx2
     select * from idx1 where rowid in
     (select rid from
     (select rid, rownum rn from
     (select rowid rid from idx1 where a between 10127 and 243625 order by a)
     )
     where mod(rn, 250) = 0)
     /

933 rows created.

SQL> commit;

SQL> analyze index idx1_idx validate structure;

Index analyzed.

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      1280        499           0

SQL> delete from idx1 where a between 10127 and 243625;

SQL> commit;

Commit complete.

SQL> insert into idx1 select * from idx2;

933 rows created.

SQL> commit;

Commit complete.

SQL> insert into idx1 select 250000+rownum from all_objects where rownum <= 126;

126 rows created.

SQL> commit;

Commit complete.

SQL> select ss.value,sy.name from v$sesstat ss ,v$sysstat sy where ss.statistic#=sy.statistic# and name like '%split%' and sid=(select distinct sid from v$mystat);

     VALUE NAME
---------- ----------------------------------------------------------------
      1496 leaf node splits
      1496 leaf node 90-10 splits
         0 branch node splits
         0 queue splits

SQL> insert into idx1 values (251000); -- this insert does split

1 row created.

SQL> commit;

Commit complete.

     VALUE NAME
---------- ----------------------------------------------------------------
      1497 leaf node splits
      1497 leaf node 90-10 splits
         0 branch node splits
         0 queue splits -- everything above behaves exactly as under ASSM



         1        1553          0     283301       283301            933
insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )

         1         153          0      78465        78465            126
insert into idx1 select 250000+rownum from all_objects where rownum <= 126

         1         963          0      10422        10422              1
insert into idx1 values (251000) -- far more "empty block" reads than under ASSM

SQL> insert into idx1 values (251001);

1 row created.

SQL> commit;

Commit complete.

EXECUTIONS BUFFER_GETS DISK_READS   CPU_TIME ELAPSED_TIME ROWS_PROCESSED
---------- ----------- ---------- ---------- ------------ --------------
SQL_TEXT
--------------------------------------------------------------------------------
         1           7          0       1476         1476              1
insert into idx1 values (251001) -- the non-splitting insert behaves the same as under ASSM

6 rows selected.

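As Demo 2 shows, under MSSM the leaf block split read even more "empty" blocks than under ASSM. MSSM does not solve the problem that a leaf block split after a mass delete has to scan a large number of non-empty blocks; in practice it may be worse. In theory, an MSSM freelist can only point to blocks that have not yet reached PCTFREE, or blocks that once reached PCTFREE and whose used space later fell below PCTUSED after rows were deleted (doc: "A free list is a list of free data blocks that usually includes blocks existing in a number of different extents within the segment. Free lists are composed of blocks in which free space has not yet reached PCTFREE or used space has shrunk below PCTUSED."). In other words, there will be more "empty" blocks under MSSM.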
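Before repeating either test it is worth confirming which space management mode the tablespace really uses. A minimal sketch, assuming the test tablespace is named IDX1 as in the demos and that the session can read dba_tablespaces:

```sql
-- SEGMENT_SPACE_MANAGEMENT is AUTO for ASSM and MANUAL for MSSM (freelists).
select tablespace_name,
       extent_management,
       allocation_type,
       segment_space_management
  from dba_tablespaces
 where tablespace_name = 'IDX1';
```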

3. Index block splits after coalesce under automatic segment space management (ASSM)

SQL> drop tablespace idx1 including contents and datafiles;

Tablespace dropped.

SQL> create tablespace idx1 datafile '?/dbs/idx1.dbf' size 500M
     segment space management AUTO -- the coalesce case under ASSM


Tablespace created.



Table created.

SQL> SQL>

Index created.

SQL> SQL> insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000;

commit;




commit;






SQL> analyze index idx1_idx validate structure;

Index analyzed.

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      1280        499           0


commit;



SQL> alter index idx1_idx coalesce;

Index altered.



SQL>

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      1280         33           0 -- the leaf blocks were merged by the coalesce


933 rows created.

SQL> SQL> commit;

Commit complete.



SQL> SQL>

Commit complete.


     VALUE NAME
---------- ----------------------------------------------------------------
      1999 leaf node splits
      1995 leaf node 90-10 splits
         0 branch node splits
         0 queue splits

SQL> insert into idx1 values (251000); -- this insert does split

1 row created.

SQL> commit;

Commit complete.


VALUE NAME---------- ----------------------------------------------------------------







         1          23          0       2218         2218              1
insert into idx1 values (251000) -- only a few buffer gets




1 row created.

SQL> commit;

Commit complete.



         1        1603          0     268924       268924            933
insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )






6 rows selected.

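As Demo 3 shows, running coalesce after the delete detaches the many empty blocks from the index structure (move empty out of index structure), so the subsequent leaf block split reads only the few blocks it needs.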

4. Index block splits after coalesce under manual segment space management (MSSM)

SQL> drop tablespace idx1 including contents and datafiles;

Tablespace dropped.

SQL> create tablespace idx1 datafile '?/dbs/idx1.dbf' size 500M
     segment space management MANUAL -- the coalesce case under MSSM



Tablespace created.



Table created.

Index created.

SQL> insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000;

commit;



commit;







SQL> analyze index idx1_idx validate structure;

Index analyzed.

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      1280        499           0


commit;




SQL>


1280 499 233499

SQL> alter index idx1_idx coalesce;

Index altered.

SQL> analyze index idx1_idx validate structure;

Index analyzed.

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      1280         33           0


933 rows created.

SQL> SQL> commit;

Commit complete.



SQL> SQL>

Commit complete.


VALUE NAME---------- ----------------------------------------------------------------


SQL> insert into idx1 values (251000); -- this insert does split

1 row created.

SQL> commit;

Commit complete.


VALUE NAME---------- ----------------------------------------------------------------





         1         153          0      77817        77817            126
insert into idx1 select 250000+rownum from all_objects where rownum <= 126

         1          19          0       2010         2010              1
insert into idx1 values (251000) -- only a few buffer gets




1 row created.

SQL> commit;

Commit complete.



         1        1553          0     281059       281059            933
insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )






6 rows selected.

As Demo 4 shows, coalesce under MSSM behaves much the same as under ASSM; the coalesce operation solves the problem effectively.

5. Locking impact of the coalesce operation

SQL> create table coal (t1 int);

Table created.

SQL> create index pk_t1 on coal(t1);

Index created.

SQL> begin
       for i in 1..3000 loop
         insert into coal values(i);
         commit;
       end loop;
     end;
     /

PL/SQL procedure successfully completed.

SQL> delete coal where t1>500;

2500 rows deleted.

SQL> commit;

Commit complete.

SQL> analyze index pk_t1 validate structure;

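Index analyzed. -- note that the analyze ... validate structure operation blocks all DML

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
         8          6        2500 -- state after the delete

Now open another session and start a DML operation:

SQL> update coal set t1=t1+1;

500 rows updated.

-- back in the original session

SQL> alter index pk_T1 coalesce; -- the coalesce is not blocked

Index altered.

-- commit in the other session so that validate structure can run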


Index analyzed.



8 3 500


-- clearly, the coalesce did not touch the blocks involved in the DML

With no DML in progress:

SQL> truncate table coal;

Table truncated.

SQL> begin
       for i in 1..3000 loop
         insert into coal values(i);
         commit;
       end loop;
     end;
     /

PL/SQL procedure successfully completed.


Index analyzed.



8 6 0

SQL> delete coal where t1>500;

2500 rows deleted.

SQL> commit;

Commit complete.


Index analyzed.



8 6 2500


SQL> alter index pk_t1 coalesce;

Index altered.


Index analyzed.



         8          1           0 -- with no DML in progress, the coalesce touched all the blocks

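As Demo 5 shows, coalesce skips the blocks involved in DML, but during the brief window of a coalesce there will not be many index blocks with active transactions. In addition, the coalesce operation does not reduce the index height.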
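One way to check the height claim on a test system is to read HEIGHT from index_stats before and after the coalesce. The following is a minimal sketch that reuses the pk_t1 index from the demo; since validate structure blocks DML, as noted in the demo, it is not meant for a busy production index.

```sql
-- Before: record height and leaf block count.
analyze index pk_t1 validate structure;
select name, height, lf_blks, del_lf_rows from index_stats;

alter index pk_t1 coalesce;

-- After: lf_blks should drop while height stays the same.
analyze index pk_t1 validate structure;
select name, height, lf_blks, del_lf_rows from index_stats;
```

index_stats holds only the result of the most recent validate structure in the current session, so each select must directly follow its analyze.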

The attachment gives a detailed description of the index rebuild and coalesce operations:

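6. Summary of the coalesce operation

Advantages:

It is a fast operation with minimal impact on overall performance (not performance sensitive).

It does not lock the table and skips index blocks that have active transactions.

It effectively solves the present problem.

It does not reduce the index height, so it does not lead to another root split later.

Disadvantages:

It has to be run periodically against specific objects; it cannot solve the problem globally once and for all.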
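Because the coalesce has to be repeated per object, it is usually scripted. The block below is a minimal sketch of such a script, not a tested maintenance job: the two index names are placeholders taken from the demos and should be replaced with the indexes that actually see regular mass deletes, and scheduling (for example after each purge job) is left to the reader.

```sql
-- Coalesce a hand-picked list of B-tree indexes owned by the current user.
begin
  for r in (select index_name
              from user_indexes
             where index_name in ('IDX1_IDX', 'PK_T1')  -- placeholder list
               and index_type = 'NORMAL') loop
    execute immediate 'alter index "' || r.index_name || '" coalesce';
  end loop;
end;
/
```

Reading the names from user_indexes rather than hard-coding the statements means a dropped or renamed index is skipped instead of raising an error.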

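7. Technical exchange about the related patch on Linux 10.2.0.4

The Metalink note for bug 8286901 describes a user who hit the same problem and filed an SR. Oracle Support supplied a one-off patch at the time, but the problem remained after the user applied it.

The original text of the note follows: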

It is similar to bug8286901, but after applied patch8286901, still see enq tx contentiona with high "failed probes on index block reclamation"

Issue encountered by customer and Oracle developer (Stefan Pommerenk).


He describes is thus:

"Space search performed by the index splitter can't find space in neighboring

blocks, and then instead of allocating new space, we go and continue to

search for space elsewhere, which manifests itself in block reads from disk,

block cleanouts, and subsequent blocks written due to aggressive MTTR

setting."

"To clarify: the cleanouts are not the problem per se. The culprit seems to

be that the space search performed by the index splitter can't find space in

neighboring blocks, and then instead of allocating new space, we go and

continue to search for space elsewhere, which manifests itself in block reads

from disk, block cleanouts, and subsequent blocks written due to aggressive

MTTR setting. This action has caused other sessions to get blocked on TX

enqueue contention, blocked on the splitting session. Advice was to set 10224


trace event for the splitter for a short time only in order to get

diagnostics as to why the space search rejected most blocks.

> A secondary symptom are the bitmap level 1 block updates, which may or may

not be related to the space search; I've not seen them before, maybe because

I didn't really pay attention :P , but the symptoms seen in the ASH trace

indicate it's the same problem. Someone in space mgmt has to look at it to

confirm it is the same problem."

I corresponded privately with this user by email. His reply:

I still have a case open with Oracle. I believe that this is a bug in the Oracle code. The problem is that it has been difficult to create a reproducible test case for Oracle support. My specific issue was basically put on hold pending the results of another customer's service request that appeared to have had the same issue, (9034788). Unfortunately they couldn't reproduce the issue in that case either. I believe that there is a correlation between the enq TX - index contention wait event and a spike in the number of 'failed probes on index block reclamation'. I have specifically asked Oracle to explain why there is a spike in the 'failed probes on index block reclamation' during the same time frame as the enq TX index contention wait event, but they have not answered my question.

I was hoping that some investigation by Oracle Support into the failed probes metric might get someone on the right track to discovering the bug. That hasn't happened though.

Hi, Thanks for your sharing. The bug (or specific ktsp behave) is fatal in response time sensitive OLTP env. I would like to ask my customer to coalesce those index where massive deleted regularly. Thanks for your help again!

Yes, I saw that. I have applied patch 8286901 and set the event

for version 10.2.0.4, but the problem still occurs

periodically. And as I mentioned before, we see a correlation

between enq TX waits and the failed probes on index block

reclamation. Which is why I still think that it is a bug. I

agree that trying to rebuild or coalesce the indexes are simply

attempts to workaround the issue and not solve the root cause.

Early on when I started on this issue I did do some index dumps

and could clearly see that we had lots of blocks with only 1 or

2 records after our mass delete jobs. I have provided Oracle

Support with this information as well as oradump files while

the problem is occurring, but they don’t seem to be able to

find anything wrong so far.

If you are interested in seeing if you are experiencing a high

‘failed probes on index block reclamation’ event run the

query below.

select SS.snap_id,
       SS.stat_name,
       TO_CHAR(S.BEGIN_INTERVAL_TIME, 'DAY') DAY,
       S.BEGIN_INTERVAL_TIME,
       S.END_INTERVAL_TIME,
       SS.value,
       SS.value - LAG(SS.VALUE, 1, ss.value) OVER (ORDER BY SS.SNAP_ID) AS DIFF
  from DBA_HIST_SYSSTAT SS,
       DBA_HIST_SNAPSHOT S
 where S.SNAP_ID = SS.SNAP_ID
   AND SS.stat_NAME = 'failed probes on index block reclamation'
 ORDER BY SS.SNAP_ID ;

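8. Testing on 11gR2

We also tested on the latest 11gR2 and could still reproduce the problem (as the figure shows, a single-row insert caused 6675 buffer_gets; this was with a larger data set).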

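We can guess that the one-off patch provided by Oracle puts an upper limit on the number of "empty" blocks a leaf block split will scan, so that scanning still happens below that limit, and that Oracle will not bring the content of this patch into its mainstream public releases.

We tried to trigger the split problem with nothing cached. The split caused about 4000 blocks of physical reads, but the operation still completed in 0.12 seconds (0.02 seconds when cached, as shown in the figure). This test used an ordinary ATA disk with a read speed of about 100 MB/s (Timing buffered disk reads: 306 MB in 3.00 seconds = 101.93 MB/sec).

In the ASH view for January 21, session 260, which caused the split, was in a single-block read wait (db file sequential read) and had already waited 43950 us, about 44 ms. That is far from the rule-of-thumb value of about 10 ms for healthy I/O. We can be confident that I/O performance is also an important factor in making this leaf block split delay so visible.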
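The single 44 ms wait seen in ASH can be put in context by looking at the instance-wide average for the same event. The query below is a sketch of one way to do that; it assumes access to v$system_event, and its figures are cumulative since instance startup, so they smooth over short spikes.

```sql
-- Average single-block read latency in milliseconds since startup.
select event,
       total_waits,
       round(time_waited_micro / total_waits / 1000, 2) as avg_wait_ms
  from v$system_event
 where event = 'db file sequential read'
   and total_waits > 0;
```

An average well above the 10 ms rule of thumb quoted above would support the view that slow I/O amplifies the split delay.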

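Conclusions

In summary, MSSM cannot prevent the transaction timeouts caused by index splits, and not deleting data is not feasible for many objects. The one-off patch for 10.2.0.4 currently exists only for Linux; we could request the patch and test it specifically (since we do not have the patch, its effect is unknown). Coalescing the index is the most practical solution currently available and has no side effects.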

