To Be Or Not To Be In the Memory Buffer

To Be or Not To Be

Sometimes when I get pulled in to look at a database, I’ll sit down with the developers or database administrators working on the database and get some background on the database.  This usually includes pointing on the important tables, an explanation of why some design decisions were made, and some pre-conceived notions about where to begin focusing.

I’m going to pass on going through my triage process today, but instead focus on the notion of important tables.  This is an area that I often get to provide some immediate feedback to the client.

The Importance in Being Important

How does one define an important data?

  • The most data? This seems a no-brainer but don’t archive tables tend to usually be pretty large and by being archive, maybe no longer important data.
  • The most recently updated data? Probably not, this could have been the StateProvince tables because Jim in marketing decided that Puerto Rico should be a country now.  Yeah, I worked at a place where some days it was a country and others a territory.
  • The most sensitive data? While a customer may feel the security of their data is the most important aspect of the data in your database.  If I’m out looking into performance issues, it might not be a consideration in this scenario.
  • The most active data? This is usually where I tend to focus.  Data that is being used is probably the most important data in the system.  Especially if a lot of the data is being used.
Finding the Most Active Data

To find the active data, a good place to start looking would be in the memory buffer.  Since data that is most actively being used will hang out there it is a good indicator of what your active data is.  By looking at the data in memory, you can start to make decisions on which tables and indexes to pay a little more attention to.  The trouble with this approach though will be that data used now may not be used later.

But that might be exactly the problem.  If you know that an un-indexed archive table is never used and the entire table is in memory, that may be the source of some problems.  Something had to move out of memory for something to move into it.

Below is the T-SQL Script that I used to determine what is in the buffer.  It is scoped to the database level so you would need to run this multiple times across a few databases to see what is going on for the entire server.

;WITH AllocationUnits
AS (
    SELECT 
        p.object_id
        ,p.index_id
        ,p.partition_number 
        ,au.allocation_unit_id
    FROM sys.allocation_units AS au
        INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3)
    UNION ALL
    SELECT p.object_id
        ,p.index_id
        ,p.partition_number 
        ,au.allocation_unit_id
    FROM sys.allocation_units AS au
        INNER JOIN sys.partitions AS p ON au.container_id = p.partition_id AND au.type = 2
)
SELECT t.name as table_name
    ,i.name as index_name
    ,i.[type_desc]
    ,au.partition_number
    ,COUNT(*) AS buffered_page_count
    ,CONVERT(decimal(12,2), CAST(COUNT(*) as bigint)*CAST(8 as float)/1024) as buffer_mb
FROM sys.dm_os_buffer_descriptors AS bd 
    INNER JOIN AllocationUnits au ON bd.allocation_unit_id = au.allocation_unit_id
    INNER JOIN sys.all_objects t ON au.object_id = t.object_id
    INNER JOIN sys.indexes i ON au.object_id = i.object_id AND au.index_id = i.index_id
WHERE bd.database_id = db_id()
GROUP BY t.name, i.name,i.[type_desc], au.partition_number
ORDER BY COUNT(*) DESC

HINT: Try running this on MSDB some time right after database backups.  I’ve run into a few cases where backup tables haven’t been cleaned up in ages and the amount of data in those tables bumps a lot of user data out of the buffer while performing backups.  Not the most ideal situation.