Oracle安全 – SCN的可能最大值与耗尽问题

  • Post author:
  • Post category:其他



Oracle安全 – SCN的可能最大值与耗尽问题


在2012年第一季度的CPU补丁中,包含了一个关于SCN修正的重要变更,这个补丁提示,在异常情况下,Oracle的SCN可能出现异常增长,使得数据库的一切事务停止,由于SCN不能后退,所以数据库必须重建,才能够重用。

我曾经在以下链接中描述过这个问题:




http://www.eygle.com/archives/2012/03/oracle_scn_bug_exhaused.html


Oracle使用6 Bytes记录SCN,也就是48位,其最大值是:


SQL> col scn for 999,999,999,999,999,999



SQL> select power(2,48) scn from dual;



SCN

————————

281,474,976,710,656


Oracle在内部控制每秒增减的SCN不超过 16K,按照这样计算,这个数值可以使用大约544年:




SQL> select power(2,48) / 16 / 1024 / 3600 / 24 / 365 from dual;



POWER(2,48)/16/1024/3600/24/365

——————————-

544.770078


然而在出现异常时,尤其是当使用DB Link跨数据库查询时,SCN会被同步,在以下链接中,我曾经描述过此问题:




http://www.eygle.com/archives/2006/11/db_link_checkpoint_scn.html


一个数据库当前最大的可能SCN被称为”最大合理SCN”,该值可以通过如下方式计算:


col scn for 999,999,999,999,999,999

select

(

(

(

(

(

(

to_char(sysdate,’YYYY’)-1988

)*12+

to_char(sysdate,’mm’)-1

)*31+to_char(sysdate,’dd’)-1

)*24+to_char(sysdate,’hh24′)

)*60+to_char(sysdate,’mi’)

)*60+to_char(sysdate,’ss’)

) * to_number(‘ffff’,’XXXXXXXX’)/4 scn

from dual

/


这个算法即SCN算法,以1988年1月1日 00点00时00分开始,每秒计算1个点数,最大SCN为16K。

这个内容可以参考如下链接:




http://www.eygle.com/archives/2006/01/how_big_scn_can_be.html


在CPU补丁中,Oracle提供了一个脚本


scnhealthcheck.sql 用于检查数据库当前SCN的剩余情况




该脚本的算法和以上描述相同,最终将最大合理SCN 减去当前数据库SCN,计算得出一个指标:HeadRoom。也就是SCN尚余的顶部空间,


这个顶部空间最后折合成天数




以下是这个脚本的内容:


Rem

Rem $Header: rdbms/admin/scnhealthcheck.sql st_server_tbhukya_bug-13498243/8 2012/01/17 03:37:18 tbhukya Exp $

Rem

Rem scnhealthcheck.sql

Rem

Rem Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.

Rem

Rem    NAME

Rem      scnhealthcheck.sql – Scn Health check

Rem

Rem    DESCRIPTION

Rem      Checks scn health of a DB

Rem

Rem    NOTES

Rem      .

Rem

Rem    MODIFIED   (MM/DD/YY)

Rem    tbhukya     01/11/12 – Created

Rem

Rem

define LOWTHRESHOLD=10

define MIDTHRESHOLD=62

define VERBOSE=FALSE

set veri off;

set feedback off;

set serverout on

DECLARE

verbose boolean:=&&VERBOSE;

BEGIN

For C in (

select

version,

date_time,

dbms_flashback.get_system_change_number current_scn,

indicator

from

(

select

version,

to_char(SYSDATE,’YYYY/MM/DD HH24:MI:SS’) DATE_TIME,

((((

((to_number(to_char(sysdate,’YYYY’))-1988)*12*31*24*60*60) +

((to_number(to_char(sysdate,’MM’))-1)*31*24*60*60) +

(((to_number(to_char(sysdate,’DD’))-1))*24*60*60) +

(to_number(to_char(sysdate,’HH24′))*60*60) +

(to_number(to_char(sysdate,’MI’))*60) +

(to_number(to_char(sysdate,’SS’)))

) * (16*1024)) – dbms_flashback.get_system_change_number)

/ (16*1024*60*60*24)

) indicator

from v$instance

)

) LOOP

dbms_output.put_line( ‘—————————————————–‘

|| ‘———‘ );

dbms_output.put_line( ‘ScnHealthCheck’ );

dbms_output.put_line( ‘—————————————————–‘

|| ‘———‘ );

dbms_output.put_line( ‘Current Date: ‘||C.date_time );

dbms_output.put_line( ‘Current SCN:  ‘||C.current_scn );

if (verbose) then

dbms_output.put_line( ‘SCN Headroom: ‘||round(C.indicator,2) );

end if;

dbms_output.put_line( ‘Version:      ‘||C.version );

dbms_output.put_line( ‘—————————————————–‘

|| ‘———‘ );

IF C.version > ‘10.2.0.5.0’ and

C.version NOT LIKE ‘9.2%’ THEN

IF C.indicator>&MIDTHRESHOLD THEN

dbms_output.put_line(‘Result: A – SCN Headroom is good’);

dbms_output.put_line(‘Apply the latest recommended patches’);

dbms_output.put_line(‘based on your maintenance schedule’);

IF (C.version < ‘11.2.0.2’) THEN

dbms_output.put_line(‘AND set _external_scn_rejection_threshold_hours=’

|| ’24 after apply.’);

END IF;

ELSIF C.indicator<=&LOWTHRESHOLD THEN

dbms_output.put_line(‘Result: C – SCN Headroom is low’);

dbms_output.put_line(‘If you have not already done so apply’ );

dbms_output.put_line(‘the latest recommended patches right now’ );

IF (C.version < ‘11.2.0.2’) THEN

dbms_output.put_line(‘set _external_scn_rejection_threshold_hours=24 ‘

|| ‘after apply’);

END IF;

dbms_output.put_line(‘AND contact Oracle support immediately.’ );

ELSE

dbms_output.put_line(‘Result: B – SCN Headroom is low’);

dbms_output.put_line(‘If you have not already done so apply’ );

dbms_output.put_line(‘the latest recommended patches right now’);

IF (C.version < ‘11.2.0.2’) THEN

dbms_output.put_line(‘AND set _external_scn_rejection_threshold_hours=’

||’24 after apply.’);

END IF;

END IF;

ELSE

IF C.indicator<=&MIDTHRESHOLD THEN

dbms_output.put_line(‘Result: C – SCN Headroom is low’);

dbms_output.put_line(‘If you have not already done so apply’ );

dbms_output.put_line(‘the latest recommended patches right now’ );

IF (C.version >= ‘10.1.0.5.0’ and

C.version <= ‘10.2.0.5.0’ and

C.version NOT LIKE ‘9.2%’) THEN

dbms_output.put_line(‘, set _external_scn_rejection_threshold_hours=24’

|| ‘ after apply’);

END IF;

dbms_output.put_line(‘AND contact Oracle support immediately.’ );

ELSE

dbms_output.put_line(‘Result: A – SCN Headroom is good’);

dbms_output.put_line(‘Apply the latest recommended patches’);

dbms_output.put_line(‘based on your maintenance schedule ‘);

IF (C.version >= ‘10.1.0.5.0’ and

C.version <= ‘10.2.0.5.0’ and

C.version NOT LIKE ‘9.2%’) THEN

dbms_output.put_line(‘AND set _external_scn_rejection_threshold_hours=24’

|| ‘ after apply.’);

END IF;

END IF;

END IF;

dbms_output.put_line(

‘For further information review MOS document id 1393363.1’);

dbms_output.put_line( ‘—————————————————–‘

|| ‘———‘ );

END LOOP;

end;

/


在应用补丁之后,一个新的隐含参数 _external_scn_rejection_threshold_hours 引入,通常设置该参数为 24 小时:



_external_scn_rejection_threshold_hours=24



这个设置降低了SCN Headroom的顶部空间,以前缺省的设置容量至少为31天,降低为 24 小时,可以增大SCN允许增长的合理空间。

但是如果不加控制,SCN仍然可能会超过最大的合理范围,导致数据库问题。

这个问题的影响会极其严重,我们建议用户检验当前数据库的SCN使用情况,以下是检查脚本的输出范例:


————————————–

ScnHealthCheck

————————————–

Current Date: 2012/01/15 14:17:49

Current SCN:  13194140054241

Version:      11.2.0.2.0

————————————–

Result: C – SCN Headroom is low

If you have not already done so apply

the latest recommended patches right now

AND contact Oracle support immediately.

For further information review MOS document id 1393363.

————————————–


这个问题已经出现在客户环境中,需要引起大家的足够重视。

在SCN告警阈值达到时,数据库中可能出现以下错误信息:


Advanced SCN by 8381 minutes worth to Ox0bad.4ab15e1,by distributed transaction remote logon,remote DB:ORCL.

Warning – High Database SCN: Current SCN value is 0x0b7b.0008e40b, threshold SCN value is 0x0b75.055dc000

If you have not previously reported this warning on this database, please notify Oracle Support so that additional diagnosis can be performed.

Warning: The SCN headroom for this database is only NN days!

Warning: The SCN headroom for this database is only N hours!

Rejected the attempt to advance SCN over limit by 984 hours worth to 0x0c00.0000ff66, by distributed transaction remote logon, remote DB: DB.ORCL.ORACLE.COM.

Client info : DB logon user SYS, machine sun, program sqlplus@orcl (TNS V1-V3), and OS user oracle

Rejected the attempt to advance SCN over limit by 9875 hours worth to 0x0c00.000003e6, by distributed transaction logon, remote DB: DB.ORCL.ORACLE.COM.


MOS参考文档:

NOTE:1376995.1 – Information on the System Change Number (SCN) and how it is used in the Oracle Database

NOTE:1393363.1 – Installing, Executing and Interpreting output from the “SCNhealthcheck.sql” script

NOTE:1388639.1 – Evidence to collect when reporting “high SCN rate” issues to Oracle Support

NOTE:1393360.1 – ORA-19706 and Related Alert Log Messages



11.2.0.2及以上版本才有意义,结果为Y,表示使用的是16K的速率,否则是使用32K速率




  1. SQL&gt


    select


    decode(bitand(DI2FLAG,65536),65536,


    ‘Y’


    ,


    ‘N’


    ) using16


  2. 2

    from


    x$kccdi2;



  3. U



  4. Y




  1. SQL&gt


    select


    decode(bitand(DI2FLAG,65536),65536,


    ‘Y’


    ,


    ‘N’


    ) using16


  2. 2

    from


    x$kccdi2;



  3. U



  4. Y


SCN Headroom:这个是指Maximum Reasonable SCN与当前数据库SCN的差值。在alert中通常是以“天”为单位,这个只是为了容易让人读而已。天数=(Maximum Reasonable SCN-Current SCN)/16384/3600/24。这个值就的意思就是,如果按SCN的每大增长速率,多少天会到达Maximum Reasonable SCN。但实际上即使如此,也不会到达Maximum Reasonable SCN,因为到那时Maximum Reasonable SCN也增大了(越时间增大),要到达Maximum Reasonable SCN,得必须以SCN最大可能速率的2倍才行。





SQL> select

2         ((((

3          ((to_number(to_char(cur_date,’YYYY’))-1988)*12*31*24*60*60) +

4          ((to_number(to_char(cur_date,’MM’))-1)*31*24*60*60) +

5          (((to_number(to_char(cur_date,’DD’))-1))*24*60*60) +

6          (to_number(to_char(cur_date,’HH24′))*60*60) +

7          (to_number(to_char(cur_date,’MI’))*60) +

8          (to_number(to_char(cur_date,’SS’)))

9          ) * (16*1024)) – 12821569053984)

10        / (16*1024*60*60*24)

11        ) headroom

12        from (select to_date(‘2012-05-30 15:09:57′,’yyyy-mm-dd hh24:mi:ss’) cur_date from dual);

HEADROOM

———-

24.1496113