2014 年 11 月 18 日
监控系统发现一个sql语句的相应时间特别长,消耗的IO资源也非常高,但是cost成本却非常低
2 FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
3 WHERE A.ATTRID = ‘res_type’
4 AND A.REGION = 23
5 AND B.ITEMID = A.ATTRVALUE
6 AND ROWNUM = 1;
Elapsed: 00:03:18.62
Execution Plan
———————————————————-
Plan hash value: 1650466411
———————————————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
———————————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 55 | 7 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| IM_RES_TYPE | 1 | 36 | 1 (0)| 00:00:01 | | |
| 3 | NESTED LOOPS | | 4 | 165 | 7 (0)| 00:00:01 | | |
| 4 | PARTITION RANGE SINGLE | | 4 | 76 | 3 (0)| 00:00:01 | 4 | 4 |
|* 5 | TABLE ACCESS FULL | GROUP_SUBS_MEMBER_ATTR | 4 | 76 | 3 (0)| 00:00:01 | 4 | 4 |
|* 6 | INDEX UNIQUE SCAN | PK_IMRESTYPE | 1 | | 0 (0)| 00:00:01 | | |
———————————————————————————————————————–
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
5 – filter(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)
6 – access(“RES_TYPE_ID”=”A”.”ATTRVALUE”)
Statistics
———————————————————-
8 recursive calls
0 db block gets
469240 consistent gets
469186 physical reads
0 redo size
619 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
那么问题是为什么cost的成本这么低,但是逻辑读确如此之高。
首先这里简单的分析下这个执行计划为什么会有如此高的逻辑读,先来看下面的查询:
2 from tbcs.GROUP_SUBS_MEMBER_ATTR A
3 where A.ATTRID = ‘res_type’
4 AND A.REGION = 23
5 and “A”.”ATTRVALUE” IS NOT NULL
6 ;
COUNT(*)
———-
14
而且这部分数据都在rownum 7千万以上的位置。
CN
———-
72251394
72253121
72261116
72287094
72292151
72296915
72296922
72304758
72333694
72334266
72334281
72334924
72336096
72336103
14 rows selected.
虽然sql语句加上了rownum=1的限制,但是由于tbcs.GROUP_SUBS_MEMBER_ATTR A中(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)的满足这个条件的第一行数据在分区段的较后面的block中(全表扫描也是有顺序的)。 此时这个单分区全表扫描 count stopeky的时候也要遍历这个分区的大部分block才能找到满足(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)的第一行数据,然后将这行数据同时与IM_RES_TYPE表做nested loop,当发现这行数据满足”RES_TYPE_ID”=”A”.”ATTRVALUE”即终止查询。 如果不满足再去单分区全扫描找第二行满足谓词条件5的数据行,然后再去和IM_RES_TYPE表比对,一直到找到符合条件的数据行为止。
如果全表扫描前几次IO能够扫描的block刚好能够满足谓词条件,则加上rownum限制条件确实是能够减少大部分的IO消耗。
ATTRID
——————————–
Flag1
1 row selected.
sys@CRMDB1>select * from tbcs.GROUP_SUBS_MEMBER_ATTR A where region=23 and attrid=’Flag1′ and ATTRVALUE IS NOT NULL and rownum=1;
1 row selected.
Execution Plan
———————————————————-
Plan hash value: 484799315
——————————————————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
——————————————————————————————————————
| 0 | SELECT STATEMENT | | 1 | 41 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE SINGLE| | 2 | 82 | 2 (0)| 00:00:01 | 4 | 4 |
|* 3 | TABLE ACCESS FULL | GROUP_SUBS_MEMBER_ATTR | 2 | 82 | 2 (0)| 00:00:01 | 4 | 4 |
——————————————————————————————————————
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
3 – filter(“ATTRID”=’Flag1′ AND “A”.”ATTRVALUE” IS NOT NULL AND “REGION”=23)
Statistics
———————————————————-
1 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
968 bytes sent via SQL*Net to client
521 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
由于满足谓词5的数据都在分区段的后面的block中,所以这个sql会消耗特别多的IO资源。
下面我们来解释下为什么在单分区全表扫描然后count stopkey时cbo的评估的cost成本只有3,先来看下表的统计信息:
关于GROUP_SUBS_MEMBER_ATTR表的统计信息:
Name of Rows Blocks Blocks Count Row Len Stats Size MM-DD-YYYY
—————————— ————– ————— ———— ——– ——- —— ————– ———-
GROUP_SUBS_MEMBER_ATTR 344,752,080 22,395,19 0 0 41 YES 17,237,604 09-10-2014
Column Distinct Number Number Sample Date
Name Values Density Buckets Nulls Size MM-DD-YYYY
—————————— ———— ———– ——- ———— ————– ———-
REGION 4 .25000000 1 0 17,237,604 09-10-2014
ATTRID 166 .00602410 1 0 17,237,604 09-10-2014
ATTRVALUE 189,682 .00000527 1 60,824,860 14,196,361 09-10-2014
Index Leaf Distinct Number AV Av Cluster Date
Name BLV Blks Keys of Rows LEA Data Factor MM-DD-YYYY
—————————— — ———— ————– ————– ——- ——- ———— ———-
PK_CM_GROUP_MEMBERATTR 3 3,462,820 341,230,660 341,230,660 1 1 108,506,400 09-10-2014
index Column Col Column
Name Name Pos Details
—————————— —————————— —- ——————
PK_CM_GROUP_MEMBERATTR GRPSUBSMEMOID 1 NUMBER(18,0) NOT NULL
ATTRID 2 VARCHAR2(32) NOT NULL
REGION 3 NUMBER(5,0) NOT NULL
**********************************************************
Partition Level
**********************************************************
Partition Number Empty Average Chain Average Global Date
Name of Rows Blocks Blocks Space Count Row Len Stats MM-DD-YYYY
————– ————– ————— ———— ——- ——– ——- —— ———-
P_R_20 107,562,800 7,002,86 0 0 0 41 YES 10-10-2014
P_R_21 65,051,340 4,220,31 0 0 0 41 YES 07-07-2014
P_R_22 89,764,040 5,816,18 0 0 0 41 YES 08-07-2014
P_R_23 77,962,200 5,069,40 0 0 0 41 YES 07-26-2014
P_R_99 0 0,00 0 0 0 0 YES 09-15-2013
对于attrid的num_distinct是166,num_nulls是0,attrvalue的num_distinct是189682,num_nulls是60824860,而且他们的number buckets都是1,也就是没有直方图,此时cbo要根据(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)谓词条件取出满足条件的数据,由于根据统计信息得出大部分的数据都是满足于这个条件的,cbo会认为只需要全表扫描的前几次IO都能够取出满足条件的数据。(关于选择率和cost成本的计算太过于复杂,不属于本篇blog讨论的范围)
单独分离出来这部分查询cbo估算的执行成本也很低,但是消耗的逻辑读还是特别高。
1 row selected.
Execution Plan
———————————————————-
Plan hash value: 484799315
——————————————————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
——————————————————————————————————————
| 0 | SELECT STATEMENT | | 1 | 41 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE SINGLE| | 2 | 82 | 2 (0)| 00:00:01 | 4 | 4 |
|* 3 | TABLE ACCESS FULL | GROUP_SUBS_MEMBER_ATTR | 2 | 82 | 2 (0)| 00:00:01 | 4 | 4 |
——————————————————————————————————————
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
3 – filter(“ATTRID”=’Flag1′ AND “A”.”ATTRVALUE” IS NOT NULL AND “REGION”=23)
Statistics
———————————————————-
8 recursive calls
0 db block gets
469266 consistent gets
468242 physical reads
0 redo size
984 bytes sent via SQL*Net to client
521 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
这里由于(“ATTRID”=’Flag1′ AND “A”.”ATTRVALUE” IS NOT NULL AND “REGION”=23)的数据具有较大的倾斜性且没有直方图,导致了cbo全表扫描count stopkey时错误的估算了成本。
下面xiaoyu新建一个表来测试有直方图和没直方图时估算类似rownum=1 and column1=A这类查询的成本区别。
OWNER COUNT(*)
—————————— ———-
OWBSYS_AUDIT 24
MDSYS 4022
QWE 1
PUBLIC 67990
OUTLN 20
CTXSYS 778
OLAPSYS 1442
FLOWS_FILES 26
OWBSYS 4
SYSTEM 1236
ORACLE_OCM 16
EXFSYS 624
APEX_030200 5122
DBSNMP 114
ORDSYS 5026
ORDPLUGINS 20
SYSMAN 7108
APPQOSSYS 10
XDB 2336
ORDDATA 514
XIAOYU 154
SYS 75434
WMSYS 666
23 rows selected.
SQL> exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>’TAB01′,method_opt=>’for all columns size 1′);
PL/SQL procedure successfully completed.
SQL> select * from tab01 where rownum=1 and owner=’SYS’;
Execution Plan
———————————————————-
Plan hash value: 4034257318
———————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
———————————————————————————————-
| 0 | SELECT STATEMENT | | 1 | 98 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
———————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
3 – filter(“OWNER”=’SYS’)
Statistics
———————————————————-
1 recursive calls
0 db block gets
6 consistent gets
45 physical reads
0 redo size
1605 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> select * from tab01 where rownum=1 and owner=’XIAOYU’;
Execution Plan
———————————————————-
Plan hash value: 4034257318
———————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
———————————————————————————————-
| 0 | SELECT STATEMENT | | 1 | 98 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
———————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
3 – filter(“OWNER”=’XIAOYU’)
Statistics
———————————————————-
1 recursive calls
0 db block gets
2576 consistent gets
2673 physical reads
0 redo size
1615 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> select * from tab01 where rownum=1 and owner=’IMP’;
no rows selected
Execution Plan
———————————————————-
Plan hash value: 4034257318
———————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
———————————————————————————————-
| 0 | SELECT STATEMENT | | 1 | 98 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
———————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
3 – filter(“OWNER”=’IMP’)
Statistics
———————————————————-
1 recursive calls
0 db block gets
3359 consistent gets
3263 physical reads
0 redo size
1343 bytes sent via SQL*Net to client
512 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
这里看出来在num bucket等于1时,全表扫描然后count stop key成本估算都是2,但是由于数据的分布问题实际的逻辑读是有量变的。
PL/SQL procedure successfully completed.
SQL> select * from tab01 where rownum=1 and owner=’SYS’;
Execution Plan
———————————————————-
Plan hash value: 4034257318
———————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
———————————————————————————————-
| 0 | SELECT STATEMENT | | 1 | 98 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
———————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
3 – filter(“OWNER”=’SYS’)
Statistics
———————————————————-
0 recursive calls
0 db block gets
6 consistent gets
45 physical reads
0 redo size
1605 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> select * from tab01 where rownum=1 and owner=’XIAOYU’;
Execution Plan
———————————————————-
Plan hash value: 4034257318
———————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
———————————————————————————————-
| 0 | SELECT STATEMENT | | 1 | 98 | 25 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 25 (0)| 00:00:01 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 25 (0)| 00:00:01 | 1 | 40 |
———————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
3 – filter(“OWNER”=’XIAOYU’)
Statistics
———————————————————-
0 recursive calls
0 db block gets
2576 consistent gets
2673 physical reads
0 redo size
1615 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> select * from tab01 where rownum=1 and owner=’IMP’;
no rows selected
Execution Plan
———————————————————-
Plan hash value: 4034257318
———————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
———————————————————————————————-
| 0 | SELECT STATEMENT | | 1 | 98 | 161 (0)| 00:00:02 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 161 (0)| 00:00:02 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 161 (0)| 00:00:02 | 1 | 40 |
———————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
3 – filter(“OWNER”=’IMP’)
Statistics
———————————————————-
0 recursive calls
0 db block gets
3359 consistent gets
3263 physical reads
0 redo size
1343 bytes sent via SQL*Net to client
512 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
如果收集该列的直方图,此时优化器清楚的知道谓词条件的数据分布,对于这类查询优化器就能够较准确的评估cost的成本。
优化这个sql并不难,有一个思路就是利用小表IM_RES_TYPE去做驱动表驱动大表GROUP_SUBS_MEMBER_ATTR,可以利用attrvalue和attrid的等值条件创建索引来避免对大表的单分区全扫描。
直接创建(ATTRVALUE+ATTRID)的索引。
SELECT A.ATTRVALUE, B.TYPENAME
FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
WHERE A.ATTRID = ‘res_type’
AND A.REGION = 23
AND B.ITEMID = A.ATTRVALUE
AND ROWNUM = 1;
Plan hash value: 1650466411
———————————————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
———————————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 55 | 7 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| IM_RES_TYPE | 1 | 36 | 1 (0)| 00:00:01 | | |
| 3 | NESTED LOOPS | | 4 | 165 | 7 (0)| 00:00:01 | | |
| 4 | PARTITION RANGE SINGLE | | 4 | 76 | 3 (0)| 00:00:01 | 4 | 4 |
|* 5 | TABLE ACCESS FULL | GROUP_SUBS_MEMBER_ATTR | 4 | 76 | 3 (0)| 00:00:01 | 4 | 4 |
|* 6 | INDEX UNIQUE SCAN | PK_IMRESTYPE | 1 | | 0 (0)| 00:00:01 | | |
———————————————————————————————————————–
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
5 – filter(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)
6 – access(“RES_TYPE_ID”=”A”.”ATTRVALUE”)
这里默认情况下cbo根据cost成本估算并不会选择先用tbcs.RESOURCE_TYPE B去做驱动表做nested loop关联,这是因为两个sql的执行成本太接近,优化器既有可能选择全表扫描count stopkey,也有可能选择索引扫描count stopkey,但是消耗的IO资源是存在量变的。
2 FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
3 WHERE A.ATTRID = ‘res_type’
4 AND A.REGION = 23
5 AND B.ITEMID = A.ATTRVALUE
6 AND ROWNUM = 1;
Elapsed: 00:00:00.06
Execution Plan
———————————————————-
Plan hash value: 458037665
—————————————————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
—————————————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 55 | 14 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
|* 2 | TABLE ACCESS BY LOCAL INDEX ROWID| GROUP_SUBS_MEMBER_ATTR | 1 | 19 | 12 (0)| 00:00:01 | 4 | 4 |
| 3 | NESTED LOOPS | | 10 | 55 | 14 (0)| 00:00:01 | | |
| 4 | TABLE ACCESS FULL | IM_RES_TYPE | 1 | 36 | 2 (0)| 00:00:01 | | |
| 5 | PARTITION RANGE SINGLE | | 10 | | 3 (0)| 00:00:01 | 4 | 4 |
|* 6 | INDEX RANGE SCAN | IND_ATTRVALUE_ATTRID | 10 | | 3 (0)| 00:00:01 | 4 | 4 |
—————————————————————————————————————————–
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
2 – filter(“A”.”REGION”=23)
6 – access(“RES_TYPE_ID”=”A”.”ATTRVALUE” AND “A”.”ATTRID”=’res_type’)
filter(“A”.”ATTRVALUE” IS NOT NULL)
Statistics
———————————————————-
0 recursive calls
0 db block gets
370 consistent gets
0 physical reads
0 redo size
619 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
这里需要强制加上leading(B A)让优化器先以tbcs.RESOURCE_TYPE B去做驱动表,然后驱动GROUP_SUBS_MEMBER_ATTR A表。
我们这里还可以有个想法就是,既然原sql消耗的成本主要在全表扫描取(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)的数据,那么我们能否用另外的一个取数据的办法就是通过index range scan的方式了,那我们能否让cbo走(ATTRVALUE+ATTRID)的索引来取数据。
2 FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
3 WHERE A.ATTRID = ‘res_type’
4 AND A.REGION = 23
5 AND B.ITEMID = A.ATTRVALUE
6 AND ROWNUM = 1;
Elapsed: 00:02:21.41
Execution Plan
———————————————————-
Plan hash value: 485372855
——————————————————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
——————————————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 55 | 11 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | IM_RES_TYPE | 1 | 36 | 1 (0)| 00:00:01 | | |
| 3 | NESTED LOOPS | | 4 | 165 | 11 (0)| 00:00:01 | | |
| 4 | PARTITION RANGE SINGLE | | 4 | 76 | 7 (0)| 00:00:01 | 4 | 4 |
|* 5 | TABLE ACCESS BY LOCAL INDEX ROWID| GROUP_SUBS_MEMBER_ATTR | 4 | 76 | 7 (0)| 00:00:01 | 4 | 4 |
|* 6 | INDEX FULL SCAN | IND_ATTRVALUE_ATTRID | 98189 | | 6 (0)| 00:00:01 | 4 | 4 |
|* 7 | INDEX UNIQUE SCAN | PK_IMRESTYPE | 1 | | 0 (0)| 00:00:01 | | |
——————————————————————————————————————————–
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
5 – filter(“A”.”REGION”=23)
6 – access(“A”.”ATTRID”=’res_type’)
filter(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL)
7 – access(“RES_TYPE_ID”=”A”.”ATTRVALUE”)
Statistics
———————————————————-
8 recursive calls
0 db block gets
227934 consistent gets
227981 physical reads
0 redo size
619 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
cob采取比较糟糕的index full scan然会回表的方式,并没有采取我们所认为的index range scan的方式来回表,那我们想想为什么优化器不能够采用index range scan的方式了,这里是因为我们创建索引是(ATTRVALUE+ATTRID)的组合索引,在这个索引取数据时由于”A”.”ATTRVALUE” IS NOT NULL并不是一个等值的条件,而这个attrvalue又是前导列,这个导致优化器如果想走index range scan需要走类似的index range scan然后 INLIST ITERATOR迭代的方式,由于attrvalue有很大一部分的null value,这将会导致这部分执行cbo估算时较高,而不选择这种执行计划,进而选择了更糟糕的index full scan回表的方式,由于attrid=’res_type’ 具有很大的倾斜性,刚好这部分数据又在索引的后面的几个leaf block中。
再来想想(ATTRID+ATTRVALUE)复合索引,通过attrid是索引前导列,(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)这个谓词条件利用attrid是索引前导列的复合索引可以很精准的从root到branch再到leaf block,从而在leaf block时index range scan。
2 FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
3 WHERE A.ATTRID = ‘res_type’
4 AND A.REGION = 23
5 AND B.ITEMID = A.ATTRVALUE
6 AND ROWNUM = 1;
Elapsed: 00:00:00.12
Execution Plan
———————————————————-
Plan hash value: 2801988880
——————————————————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
——————————————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 55 | 9 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | IM_RES_TYPE | 1 | 36 | 1 (0)| 00:00:01 | | |
| 3 | NESTED LOOPS | | 4 | 165 | 9 (0)| 00:00:01 | | |
| 4 | PARTITION RANGE SINGLE | | 4 | 76 | 5 (0)| 00:00:01 | 4 | 4 |
|* 5 | TABLE ACCESS BY LOCAL INDEX ROWID| GROUP_SUBS_MEMBER_ATTR | 4 | 76 | 5 (0)| 00:00:01 | 4 | 4 |
|* 6 | INDEX RANGE SCAN | IND_ATTRID_ATTRVALUE | 98189 | | 4 (0)| 00:00:01 | 4 | 4 |
|* 7 | INDEX UNIQUE SCAN | PK_IMRESTYPE | 1 | | 0 (0)| 00:00:01 | | |
——————————————————————————————————————————–
Predicate Information (identified by operation id):
—————————————————
1 – filter(ROWNUM=1)
5 – filter(“A”.”REGION”=23)
6 – access(“A”.”ATTRID”=’res_type’)
filter(“A”.”ATTRVALUE” IS NOT NULL)
7 – access(“RES_TYPE_ID”=”A”.”ATTRVALUE”)
Statistics
———————————————————-
8 recursive calls
0 db block gets
10 consistent gets
5 physical reads
0 redo size
619 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
这里我们需要注意的是在表中添加了rownum的限制又添加了部分谓词的条件时,刚好满足这个谓词条件对应的数据块在表段较后面的数据块中,且对应的列没有直方图又存在数据倾斜,此时cbo估算这个全表扫描 count stopkey成本往往是不准确的,当然上面这个例子确实比较特殊,没有直方图且数据有倾斜性,这里借助这个sql case分析只是为了让我们更好的理解nested loop的原理、以及rownum对于表扫描索引扫描时执行计划和成本估算的影响。