Updating Index, Really?

Oracle doesn’t generally care where or in what order data is stored in heap tables, unless the table is partitioned or clustered. Therefore, when an update operation is performed on a column value, Oracle can basically make the change “in place” within the table block. If the value of a name column changes from say “BOWIE” to a new value of “FOOTE”, Oracle can simply just make the necessary change to the specific table block. Only if the new value is larger and there’s not sufficient space within the current table block for the new value does the picture get a little more complicated, with Oracle having to migrate the row.

However, the story isn’t quite so simple for corresponding changes to index column values. The key difference (no pun intended) is that unlike most heap tables, the location and order of the data within the index is very much an issue. All index entries must be and must always be in the order of the indexed columns. Otherwise, it would be impossible for Oracle to efficiently navigate down the index structure to find within the index all the necessary indexed values required for an index range scan. Imagine for one moment how difficult it would be to find someones phone details if the telephone book wasn’t ordered on names or to use an index in the back of a reference book if the index wasn’t ordered.

The order of an index is crucial and must always be in the order of the indexed column values. Therefore, if a specific indexed value of say “BOWIE” is updated to a new value of “FOOTE”, Oracle can’t simply make the change “in place” within the leaf block as this would result in the index order no longer being maintained. The new value of “FOOTE” would be surrounded by index values starting with “B” and there would be no easy way to subsequently find it within the index. It would likely not even be in the same index leaf block as those other index entries beginning with “F” where the new value should actually reside.

Therefore, Oracle doesn’t actually “update” an index value. The old index entry is marked as deleted and the a new index entry is inserted in the appropriate location within the index to ensure the index order is maintained. So the update of an index value is effectively a delete operation followed by an insert.

There’s no such things as an “update” operation as such on a index and that an update is effectively a delete followed by an insert operation.
Oracle only marks index entries as deleted and doesn’t physically delete the index entry at the time of the transaction.

This leads some folk into (incorrectly) thinking indexes that experience lots of update (or delete) operations need to be frequently rebuilt as the delete space might accumulate and waste space within the index structure over time .
deleted space is actually generally automatically reused by Oracle. For example, all it takes is one subsequent tiny little insert operation into a leaf block for all deleted entries within the leaf block to be automatically cleaned out by Oracle.

Therefore just because an index experiences lots of update activity doesn’t necessarily mean the index needs to be rebuilt. Generally, all deleted space is reused and is effectively nothing but free and available space within the index.
Let’s first just create a simple little table that will have two numeric columns:

SQL> create table bowie (id number, value number);

Table created.

Now let’s populate the first column with a monotonically increasing unique identifier and the second column with a random number:

SQL> insert into bowie select rownum, ceil(dbms_random.value(0,50000))
from dual connect by level <=50000;

50000 rows created.

SQL> commit;

Commit complete.

Now we’ll create an index on the value column that is populated with a bunch of random numbers:

SQL> create index bowie_idx on bowie(value);

Index created.

Let’s collect some statistics and see how much space is being used by this “freshly” created index:

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used, del_lf_rows from index_stats;

—— ——- ——- ——– ———–
128 110 888032 90 0
Specifically note the number of leaf blocks and the overall btree space usage of the index. Also note that as the index has only just been created, we currently have no deleted index entries.

OK, next we’re going to update the index column with a new random number, one row at a time, for effectively 50% of all rows in the entire table. The table currently has 50,000 rows and we will update every other row (that’s 25,000 rows for those mathematically challenged) with a new random value. Now that will be effectively 25,000 separate delete operations and 25,000 separate insert operations.

There are some folk who would think such an exercise would result in 25,000 deleted index entries which need to be cleaned up at some point via an index rebuild.

There are some folk who would think updating 50% of all index column values would result in so much wasted space that the index would grow in an inefficient manner in order to store all these deleted index entries, likely causing performance issues.

There are some folk who think that such a regular and high proportion of updates on an index column in a table is a clear indication that such an index should be rebuilt on a regular basis.

Well, let’s see what happens. First, let’s update every other row in the entire table with a new random number for the value column:

SQL> begin
2 for i in 1..25000 loop
3 update bowie set value = ceil(dbms_random.value(0,50000)) where id = i*2;
4 commit;
5 end loop;
6 end;
7 /

PL/SQL procedure successfully completed.

Let’s see how badly the index has been impacted:

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used, del_lf_rows from index_stats;

—— ——- ——- ——– ———–
128 110 888032 90 60
How interest !!

The first thing to note is that there aren’t actually 25,000 deleted index entries at all, even though we’ve just updated 25,000 index entries. There are just 60 deleted entries currently in the index. Just 60, that’s it !!

Why ?

Because the index is effectively just a random based index and while we may update (and hence effectively delete) an index entry from a specific leaf block, at some later time we’re likely to insert another index entry into this same leaf block, thereby cleaning out any deleted entries it may contain. Effectively, almost all the deleted index entries are being automatically cleaned out by subsequent random inserts throughout the index structure.

The relative handful of currently marked deleted index entries (60) happen to exist in leaf blocks that have not had a subsequent insert since the last delete operation in the specific leaf block. But even these deleted entries will eventually be cleaned out and the space reused by any other subsequent inserts in the specific leaf blocks.

The deleted space is simply not an issue, the vast majority of it has been cleaned out and reused and those entries that haven’t yet been cleaned out will likely be reused by subsequent insert operations anyways.

If we look at the actual space used by the index, after 50% of all index entries have been updated and we note that the index has not changed at all. It has exactly the same number of leaf blocks and is using exactly the same amount of btree space. (Note: because the index values are random, it’s likely that the new values will not be exactly distributed as it was previously and there might be the possibility that the odd index leaf block could fill and split as it contains more associated values than previously. Regardless, as the index grows with more index entries, this is not going to be an issue anyways).

There is nothing “wrong” or inefficient or fragmented with this index, even though we’ve just updated (and hence deleted) 50% of all its index entries.

With deleted index entries being trivial and likely to be reused at some later time anyways and with the index having the same leaf blocks and btree space as it did when the index was newly created, rebuilding such an index would be a total and utter waste of time and resources.

Again, just because an index column is frequently updated, it doesn’t necessarily mean the index is a candidate for a periodic index rebuilds. Any such suggestions are simply misguided …


reference: richardfoote.wordpress.com


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s