IN2BI, Microsoft Business Intelligence
data, stories & insights
rss | email | twitter

SSIS Pattern: Drop- and rebuild indexes dynamically

Fact- and dimension tables in our data warehouses are queried often. The queries used in our reports and the queries used for processing SSAS cubes create a large workload of select queries. These “select” queries benefit from indexes and will return results faster with appropriate indexes.

However a lot of indexes are a pain when you load the date into your data warehouse. A good practices is therefore to drop your NONCLUSTERED indexes at the start of loading your dimensions and fact tables and rebuilt them after loading has finished.

The pain of your DBA

Your DBA will typically change these indexes periodically based on his performance tuning efforts of the latest workload. He will be very disappointed when after changing these indexes he is confronted with the old indexes after a rebuild of indexes.

clip_image018So we need some way to:

  • script the currently used indexes
  • store them
  • drop them
  • load the dimensions
  • load our fact tables
  • rebuild the indexes again.

In this blog post I walk you to the creation of the package with this control flow:

But first let’s focus on the SQL script we need to capture the current indexes. Unfortunately you can not use the SQL Function OBJECT_DEFINITION to generate the code necessary to create an index since that object isn’t supported. So we have to get creative.

The next SQL Statement is based on this blogpost by Simon Sabin. It will return a dataset with 5 columns:

  • SchemaName
  • ObjectName
  • IndexName
  • DropIndexStatement
  • CreateIndexStatement

clip_image002

The SQL Statement to return this list of indexes:


SELECT SchemaName = SCHEMA_NAME (o.SCHEMA_ID)
   ,
ObjectName = o.name
  
,IndexName = i.name
  
,DropIndexStatement = 'DROP INDEX [' + i.Name + '] ON [' + SCHEMA_NAME (o.SCHEMA_ID)
   +
'].[' + o.name + ']'
  
,CreateIndexStatement = 'CREATE ' + (CASE i.is_unique WHEN 1 THEN 'UNIQUE ' ELSE '' END)
   +
'NONCLUSTERED INDEX ['
  
+ i.name + '] ON [' + SCHEMA_NAME (o.SCHEMA_ID) + '].[' + o.name + '] ( '
  
+ LEFT(list, ISNULL(splitter-1,LEN(list))) + ' ) '
  
+ ISNULL('INCLUDE ( ' + SUBSTRING(list, indCol.splitter +1, 100) + ' ) ','')
   +
ISNULL('WHERE ' + i.filter_definition,'')
FROM sys.indexes i
JOIN sys.objects o ON i.OBJECT_ID = o.OBJECT_ID
CROSS APPLY
(SELECT splitter = NULLIF(CHARINDEX('|',indexCols.list),0)
,
list
FROM (SELECT list=CAST((
  
SELECT CASE WHEN sc.is_included_column = 1 AND sc.ColPos = 1 THEN '|' ELSE '' END +
    
CASE WHEN sc.ColPos  > 1 THEN ', ' ELSE '' END + name +
           
CASE WHEN sc.is_descending_key=1 THEN ' DESC' ELSE '' END
            FROM
(SELECT sc.is_descending_key
      
, sc.is_included_column
      
, index_column_id
      
, name = '[' + name + ']'
      
, ColPos = ROW_NUMBER() OVER (PARTITION BY sc.is_included_column
          
ORDER BY sc.index_column_id)
        
FROM sys.index_columns  sc
        
JOIN sys.columns c ON sc.OBJECT_ID = c.OBJECT_ID
          
AND sc.column_id = c.column_id
        
WHERE sc.index_id = i.index_id
                   
AND sc.OBJECT_ID = i.OBJECT_ID ) sc
                   
ORDER BY sc.is_included_column, ColPos
                     
FOR XML PATH (''), TYPE) AS VARCHAR(MAX))
   )
indexCols
) indCol
WHERE i.is_primary_key=0
AND i.is_unique_constraint=0
AND i.type_desc=N'NONCLUSTERED'
AND o.TYPE=N'U'
ORDER BY SchemaName, ObjectName, IndexName

The Steps to change/create the master package:

1. Open the master package that you use to load the dimensions and facts.

2. Add Variables:

  • IndexList with Data Type=Object
  • DropIndexStatement with Data Type =String
  • CreateIndexStatement with Data Type =String
    clip_image004


3. Add an Execute SQL Task to the Control Flow

  • Change Name to Get IndexList
  • Change ResultSet to Full result set
  • Select the target connection under Connection
  • Copy the SQL statement (see above) as the SQLStatement
    clip_image006

 

  • Select the tab Result Set
  • Choose Add and use 0 as the Result Name and Select User::IndexList as the Variable Name 
    clip_image008


4. Add a Foreach Loop Container to the Control Flow and double click on it to open the Foreach Loop Editor

  • Under the tab General change Name and Description to Drop each index
  • Under the tab Collection change Enumerator to Foreach ADO Enumerator and select User::IndexList from the dropdown box as the ADO Object source variable clip_image010

 

  • Under the tab Variable Mappings select User::DropIndexStatement and type 3 as Indexclip_image012


5. Drop an Execute SQL Task in the Drop each index container and double click on it to open the Execute SQL Task Editor.

  • Change Name and Description to Drop Index
  • Select the target connection under Connection
  • Select Variable as the SQLSourceType
  • Select User::DropIndexStatement as the SourceVariable clip_image014


6. Add a Foreach Loop Container to the Control Flow and double click on it to open the Foreach Loop Editor

  • Use Create each index as name for the Foreach loop container
  • In the Collection tab of the Foreach Loop Editor select Foreach ADO Enumerator and use the User::IndexList as ADO Object source variable
  • In the Variable Mappings tab Select User::CreateIndexStatement as variable and 4 as index


7. Drop an Execute SQL Task in the Drop each index container and double click on it to open the Execute SQL Task Editor.

  • Change Name and Description to Create Index
  • Select your Connection under Connection
  • Select Variable as the SQLSourceType
  • Select User::CreateIndexStatement as the SourceVariableclip_image016


8. Change the Precedence Constraints to map the example of the afore mentioned control flow.

» Similar Posts

  1. BIML Script To Disable Indexes And Later Rebuild Indexes
  2. TSQL Metadata Queries
  3. SQL2012 Windowing Functions In The Data Warehouse–1. Dimensions

» Trackbacks & Pingbacks

    No trackbacks yet.

» Comments

  1. Remco van Haaren avatar

    Hi Marco,

    Prima... precies waar ik naar op zoek was!

    Had wat performance issues dit index gerelateerd, na een load was de index weer outdated...

    Bedankt.

    Mvgr

    Remco

    Remco van Haaren — April 18, 2012 11:42 AM
  2. Marco Schreuder avatar

    Hoi Remco,

    Goed om te horen dat ik je heb kunnen helpen en leuk om weer iets van je te horen. Hopelijk gaat alles goed.

    Marco Schreuder — April 18, 2012 12:31 PM
  3. SQL guy avatar

    I put that statement in a query and nothing showed up.

    SQL guy — July 1, 2013 3:23 AM

Comments are closed