Unexpected Full Table Scans in RAC


select SID, sql_id, sql_child_number
from v$session
where event = ‘gc buffer busy’;

After I got the SIDs, I checked the SQL statements issued by the sessions:

select sql_text
from v$sql
where sql_id = ”
and child_number = 24

The results were interesting – all of them were updating or selecting from a single table. Next I checked the execution plan for the SQLs:

select * from table (
dbms_xplan.display_cursor (
‘9h1r67dtjxfbs’,24,’BASIC’
))
/

PLAN_TABLE_OUTPUT
——————————————
EXPLAINED SQL STATEMENT:
————————
select …

Plan hash value: 871755554
———————————————————————————
Id Operation Name
———————————————————————————
0 SELECT STATEMENT

The output tells it all – there is a full table scan going on for a 14 million row table. Well, why? I checked for the presence of indexes and yes, there is an index. Why isn’t the index being used?

This is yet another case of “why my index is not being used”. It happens at a relatively lesser frequency; but happens nevertheless.

Upon examination I discovered that the table is range partitioned and the index is local. Next I checked if the index has become unusable somehow:

select partition_name, status
from dba_ind_partitions
where index_name = ‘the_inex_name’;

PARTITION_NAME STATUS
—————————— ——–

PT_SEP2009 USABLE
PT_OCT2009 UNUSABLE
PMAX UNUSABLE

Note the status of the last two partitions – UNUSABLE. Why so?

I checked the last_ddl_time:

select last_ddl_time
from dba_objects
where object_name = ‘the_index_name’;

And voila! The index was altered today. Why? The answer came quickly: there is a partition maintenance script that splits the PMAX partition to PMAX and a partition named PT_OCT2009. The split worked fine; but the DBA forgot to add a line like:

alter index this_index_name rebuild partition PMAX;

So, the partition remained unusable. Remember, this is a local index; so the clause MAINTAIN GLOBAL INDEXES will not do anything. The partition must be explicitly rebuilt. Since the partition was not usable, the query plan suddenly changed to full table scan, as I saw earlier. Since this caused a lot of buffer gets, the buffer pool of all the instances of the RAC cluster got flloded with the blocks of this table, causing gc buffer busy waits. So, even though the wait event is RAC related, the issue was due to abnormally large buffer gets and was limited to a specific application only. Eventually, the issue would have saturated the interconnect and affected other apps; but since we caught it early, that event was averted.

I rebuilt the index partitions by explicitly issuing the alter index commands. After that, the queries started using the index and, of course, the gc waits disappeared.

Lessons:

(1) just because you see “gc” waits, don’t automatically assume it is RAC specific issue. The underlying issue may have manifested itself as the RAC component issues.
(2) Add to the list of “why oracle is not using my index” one more: unusable index partition(s)
(3) When you split partition of a table, in addition to the clause “maintain global indexes”, issue the rebuild of the split partitions of local indexes.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s