The paper is written by Craig A. Shallahamer (email@example.com). Thanks to the author for sharing such an interestng information.
Author has explained the algorithm for managing buffers in Oracle buffer cache. Now obviously all the buffers cannot stay in buffer cache because of the limited size of cache. So buffers which are older or non-popular should go out of the buffer cache. The algorithm is all about the criteria for buffer cache to stay or leave buffer cache.
Initially with older release, Oracle started with LRU (Least Recently Used) algorithm. Conceptually LRU list is nothing but a list of pointers to buffer blocks. When a buffer block is touched, the pointer moves towards the most recently end of LRU chain. Remember that its the pointer which moves up the chain, buffer blocks are never moved. The general idea is to keep the more popular blocks in buffer cache and ask less popular block to leave.
But this algorithm was having some issues. For example a full table scan !! This will get all the buffers into the buffer cache making other important buffers to leave the cache. For example if the buffer cache is having 500 blocks and full table scan is getting 600 blocks in buffer cache, all the popular blocks will go away.
To overcome this problem, Oracle came with a modified LRU algorithem. This modified LRU algorithm takes care of full table scan by keep the buffers of full table scan at the LRU end of LRU chain and also it will allow only limited number of blocks to be put in LRU end of LRU chain. This will avoid flooding buffer cache with huge number of buffers from full table scan. Also along with this algorithm, Oracle implemented multiple buffer pools – KEEP BUFFER POOL, RECYCLE BUFFER POOL and DEFAULT BUFFER POOL. As you must be knowing, the data which is used frequently and should remain in buffer cache should be placed in KEEP BUFFER POOL, buffer that should not stay for longer in buffer cache should be placed in RECYCLE BUFFER POOL and if we dont specifically mention any buffer pool duing table creation, it will go in DEFAULT POOL.
With this background lets move towards understanding the touch count algorithm.
Buffer cache Management
Before understanding the complex logic, we should understand the needs for touch count algorithm. Why does Oracle needs to have such a complex alogrithm in place? We have already seen that modified LRU algorithm takes care of full table scan. But if you think again we have another issue – Large index range scan? Imaging large number of index leaf blocks flowing into the buffer cache. This issue wont be addressed by modified LRU.
Also with growing sizes for buffer cache, better and better performance requirements and more control Oracle introduced touch count algorithm.
For smooth and effective buffer cache operation, a very fast and very flexible algorithm must be implemented that essentially forces every buffer to seriously earn the right to remain in buffer cache. Touch count algorithm makes it very difficult for buffer to simply remain in buffer cache. It’s almost cruel how many hurdles a buffer must continually jump to simply remain in buffer cache.
At the core of touch count algorithm are the harsh requirements placed on each buffer to not only remain in cache, but to remain in MRU end of LRU list.Conceptually touch count algorithm does this by assigning each buffer a counter. Each time a block is touched its counter is incremented. The value of this counter is used to assign popularity to the blocks. But its not staright forward. Keep reading !!
Touch Count Algotrithm
Oracle does lot of nasty things to make it very difficult for a buffer to remain in buffer cache. Lets understand each concepts
Each LRU is divided into two basic areas or region; a hot region and a cold region. All buffers in hot region is called hot buffers and all buffers in cold region is called cold buffers.There is a midpoint marker between hot region and cold region. This mid point marker moves in order to make sure that correct number of buffers are present in each regions. This mid pointer is not associated with any buffer.
By default Oracle divides LRU equally between hot and cold region. That means, 50% of the buffers will be in hot region and 50% will be in cold region. However we can change the default setting by changing a hidden parameter _db_percent_hot_default. If we increase this parameter then buffers in hot region will increase, that is buffers above mid point wil increase.
When a server process reads a block from disk into the buffer cache, it is placed in the middle of LRU chain, that is, between hot region and cold region. This is called mid-point insertion and is the fundamental concept in touch count algorithm. These buffer has to earn there way into hot region by increasing the touch count. Frequent access to these block will increase there touch count and block will move in the hot region.
Touch Count Increment
Now we know that when a server process reads a new block from disk into the buffer cache, where exactly its going to put the block. Lets now consider how the tocuh count increases for blocks.
Therotically when ever a buffer is touched, its touch count should increase. but practically Oracle does not let that happen. Think of a situation where some processes needs some blocks in buffer very frequently for some period, and after that period that block is not so required. In that case Oracle process might access the block so frequently that within a second its touch count will grow huge and it will become eligible to be placed in hot region. Not only that, it might become eligible to stay in hot region. Since this block wont be used later, we dont want this block to stay just because it was used for initial period of time.
To overcome this problem, Oracle only allows buffer’s touch count to be incremented, at most, once every 3 seconds. Again, thi 3 scond is default timing by Oracle, which can be changed using hidden parameter _db_aging_touch_time.
When a touch count is incremented buffer pointer should move. But movement of buffer pointer is independent of touch count increment. Also for any activity in memory area oracle needs a latch for assuring cache consistency. But there is an exception here !! For updating touch count, Oracle does not use latch and buffer block can be modified while touch count is getting incremented. But more interesting is that, two processes may increment the touch count to same value, and when this happens Oracle assures the worst that could happen is the touch count is not actually incremented every time a buffer is touched and that no cache corruption will result.
As mentioned previously, when a buffer is brought into the buffer cache its placed in the middle of hot region and cold region. Unlike LRU algorithm, touch count algorithm will not move the block to hot region just because its touched. Yes, its touch count will probably be incremented.
When a server process if looking for the free buffer to place the content of datafile block into buffer cache or when the DBWR is looking for dirty buffer to write to disk, and if the buffer’s touch count is observed to be greater then 2, its moved to MRU end of LRU list. This default threshold of block movement is controlled by hidden parameter _db_aging_hot_criteria.
Oracle touch count implementation is tough on buffers !! When buffer is moved to MRU end of LRU list, its touch count is set to zero. So this buffer which is newly brought into the hot region has hit touch count reset to 0 immediately. For this block to remain in the hot region its touch count should be incremented significantly. So if this block is really accessed frequenctly, its touch count will increase automatically and it will servive in hot region.
Hot and Cold Movement
If a buffer is moved from cold region to hot region, the mid point marker has to shift to accomodate correct number of blocks in hot a cold region. So 1 block from hot region will be forced into the cold region which is least frequently used and which belongs to LRU end of LRU list. When this happens the touch count of that block is reset to 1. Even if the buffer’s touch count is 250, after moving to cold region its touch count is reset to 1. This threshold crossing touch count reset value is controlled by the instance parameter _db_aging_cool_count. This means the buffer must be all touched again to make its move to the hot region. This is how the blocks are moved and managed in buffer cache.
For more information on performance tuning buffer cache using the above parameters check the reference section below.
ALL ABOUT ORACLE’S TOUCH COUNT ALGORITHM – Criag A. Shallahamer (Version 4a, January 5, 2004)