IN2BI, Microsoft Business Intelligence
data, stories & insights
rss | email | twitter

Information about Biml.

Creating a Meta Data Driven SSIS Solution with Biml - 1. Overview

In this series of blog posts I’ll describe and walk you through the creation of a simple meta data driven SSIS solution using Biml.

Biml (Business Intelligence Markup Language) is an XML-based language that allows you to describe your BI Solution in a declarative fashion.

The Biml Package Generator is a new feature of BIDSHelper, a free add-in for the Microsoft Business Intelligence Development Studio (BIDS). It allows you to create packages based on Biml documents. Furthermore you can use BimlScript to automate the creation of these packages.

I will use the Northwind sample database as the source for a simple Kimball-based data warehouse (MyDwh). I will also use a staging database (MyDwh_st) and a meta database (MyDwh_meta) that includes a table with a source-to-target mapping that will drive the creation of SSIS packages.


The series consists of the following posts:

  1. Overview (This page)
  2. Setting the stage: primarily a description of the SQL Server databases and SQL statements to create the environment.
  3. Creating a TablePackage: creating a Biml document to create a package that imports one table.
  4. Creating the TablePackages: using BimlScript to automate the creation of packages.
  5. Creating the MasterPackage
  • Wed
    Dec 14

    Change Data Capture (CDC) and BIML

Change Data Capture (CDC) has been an enterprise feature of SQL Server since version 2008 [1][2]. It's a relatively non-intrusive way to collect only the changes in selected tables of your source system [3], and thus provides a way to handle your data warehouse load effectively.

    Integration Services (SSIS) and CDC

    To handle these changes you can roll your own SSIS-CDC system, where you choose where to save the state of the rows that you handled [4]. Since SQL Server 2012 it's more convenient to use the new CDC components that Microsoft ships with the product.

    There are three CDC components:

    • CDC Control Task: controls the life cycle of change data capture (CDC) packages. It handles the management of the Log Sequence Number (LSN) ranges that are processed in a run of a CDC package.
    • CDC Source: reads a range of change data from change tables and delivers the changes downstream to other SSIS components.
    • CDC Splitter: splits a single flow of rows from a CDC source data flow into different flows of rows for insert, update and delete operations.


    These new components are implemented as custom components and aren't natively supported in BIML. So to use them in a BIML solution you have to do some extra work.

    Whenever I create a BIML solution I have a three step approach:

    1. Create a package in SSIS (Or at least a design)
    2. Create a BIML file that can create the package (without script)
      I call this the vanilla version.
    3. Gather meta data if not already available and add script to the BIML file(s) to create all the packages

    For this blog post I'll focus on step 2. Matt Mason of Microsoft already wrote a nice walkthrough that serves well for step 1. Step 3 is something for a future blog post or a sponsor.

    Preparing the environment

    If you didn't follow the walkthrough by Matt Mason you need to create the environment with SQL Server Management Studio:

    1. Create a database (CDCTest)
    2. Create the DimDepartmentGroup_CDC table that will be used as a source.
    3. Enable CDC for the database
    4. Add a primary key to the table created in step 2.
    5. Enable CDC for the DimDepartmentGroup_CDC table that was created in step 2
    6. Create a destination table
    7. Create a table to store the cdc state


    If you did follow the walkthrough, skip steps 1, 3 and 7.

    The necessary T-SQL code (available at GitHub Gist):

    Creating the BIML file for the initial load

    The initial load package is relatively simple. This is an overview of the resulting package with the CDC Components marked in yellow.

    Control Flow Data Flow
    image image


    The necessary BIML code to create the initial load package
    (available at GitHub Gist):

    The hard part in this code is getting the necessary information for the CDC Control Task. You'll typically find it in the XML code of the vanilla package I described earlier. Open this package by clicking View Code in the context menu (F7) and find the CDC Control Task.


    In this XML snippet you'll find the CreationName, a required attribute of the CustomTask in Biml. The marked ObjectData snippet can be copied into a CDATA tag within the ObjectData node of the CustomTask.

    Creating the BIML file for the incremental load

    In the incremental load we will use the CDC data flow components. A complicating factor in creating the BIML file is the need for column metadata. Column information can normally be derived by BIML/SSIS; for these custom components we have to provide it ourselves. Before creating the BIML let's look at the resulting package:

    Control Flow Data Flow
    image image

    Some remarks:

    The CDC Control Tasks will typically handle changes of more than one table. For now I've implemented this using a separate sequence container for the DepartmentGroup table. You can create additional containers for extra tables or create several packages. Then you will typically move these control tasks to a master package.

    Changed and deleted rows are handled in a batch using staging tables. These staging tables are created at the start of the container and dropped at the end.

    The necessary BIML code to create the incremental load package
    (available at GitHub Gist):

    More Information:

    1. MSDN: About Change Data Capture in SQL Server
    2. Whitepaper by Paul Randal: Tracking Changes in your Enterprise Database
    3. SQLCAT Article: Tuning the Performance of Change Data Capture in SQL Server 2008
    4. blog by Ray Barley: Process Change Data Capture in SQL Server Integration Services (2008)
    5. Matt Mason: CDC in SSIS for SQL Server 2012
  • Mon
    Jul 14

    Viewing or Saving the Compiled Biml File(s)

    Some time ago a BIML enthusiast asked me for the location of the temporary files that BIDSHelper creates when you generate the SSIS packages. I couldn't help him and explained that the Biml engine first compiles the Biml documents, applying any BimlScript in memory, and then creates the SSIS packages. The intermediate step isn't persisted on disk.

    Debugging Biml Files Made Easier

    Obviously he needed this intermediate result to better debug his Biml files and improve his development efforts. Recently I learned a simple trick to create the intermediate results he wanted, and I'd like to share it with you in this blog post:

    • Add a new Biml File to your project and name it SaveCompiledBiml.biml
    • Replace the contents of this file with the next code block

    Some remarks:

    Change the directory and filename of the sFile variable to match your environment/wishes.
    In this example I have used tier 5. If you have files with a higher tier, change the template tier directive in the SaveCompiledBiml file.
    (Biml files are compiled in the order of their "tier": the files with the lowest tier first and those with the highest tier last. The default tier is 0 for files without BimlScript and 1 for files with BimlScript.)
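    As a rough illustration of that tier ordering, here is a small Python sketch. The file names and the `compile_order` helper are made up for this example; the tier rules are the ones described above.

    ```python
    def compile_order(files):
        """files: list of (name, explicit_tier_or_None, has_bimlscript).

        Files are compiled lowest tier first. A file without an explicit
        tier directive defaults to tier 0 (plain Biml) or 1 (BimlScript).
        """
        def tier(f):
            name, explicit, has_script = f
            if explicit is not None:
                return explicit
            return 1 if has_script else 0
        return [name for name, _, _ in sorted(files, key=tier)]

    order = compile_order([
        ("SaveCompiledBiml.biml", 5, True),   # explicit tier 5: compiled last
        ("Connections.biml", None, False),    # plain Biml: tier 0
        ("Packages.biml", None, True),        # BimlScript: tier 1
    ])
    print(order)
    ```

    Because SaveCompiledBiml.biml sits at tier 5, it is compiled after all lower-tier files, which is exactly why it can see and save their compiled output.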

    Do you want to improve the quality and speed of your ETL-development?

    Biml is quickly gaining popularity as the way to automate the creation of SSIS packages, thus saving time and money while improving quality. Are you interested in automating SSIS packages with Biml? Consider my Biml Workshop to get a head start.

  • Sun
    May 14

    Building a data warehouse while the source systems are still in development

    Some years ago the client of a contractor I worked for made some major investments in their IT landscape. They replaced their ERP and CRM systems, and of course BI/reporting was part of the acceptance criteria for the new system. It was a big project, and a disaster! ERP, CRM and BI consultants from the contractor were running in and out, discussing requirement details with the client and adapting the systems to these wishes. For the BI team it was especially hard: when we built something on Monday, chances were slim that it still worked on Thursday. We depended upon the ERP and CRM teams to communicate their changes and provide us with correct test data, and there was no love lost between the teams. I was very glad when the opportunity arose to leave this war zone and move on. And I did.


    Nevertheless it seems to have become a trend to build or adapt the BI system before a new source system has been launched, and to make it part of the acceptance criteria for the new system. This of course offers the client an easy way to verify the new system by comparing the reports they work with. In my previous three projects (part of) the source system was still in development, and I would probably have gone crazy keeping up with all the changes if I hadn't been able to automate the work.

    Agility is the name of the game


    In these situations I have found that you (as in an agility drill) need three things:

    • An Alert System to inform you of changes in the source systems.
    • A Data Warehouse Automation Framework.
    • A Testing Framework.


    Testing Framework

    A testing framework has a complicated ring to it. But if you keep it simple and focus on the results the users expect to see, it will be easy and the advantages are great:

    • You’ll notice any errors early in the process.
    • You can correct the errors before the users start testing.
    • You can pinpoint any changes made in the source system that ruin your reports.
    • You’ll gain confidence in your solution and sleep better.

    I’ll describe a method that has worked for me. Key to success herein is starting early!

    First get hard copies and digital versions of the reports the users are using now. Ask them to highlight the important measures. Start entering the tests in an Excel spreadsheet and use the digital versions of the reports to copy any relevant data. I use the following format, where I keep the values in the testName column unique:

    testName testType criterion1 criterion2 expectedValue expression
    InternetSales201401 DWH 201404 Internet 125.035  

    Now the difficult/laborious part: start entering SQL statements in the expression column that will return the expected value. Use [criterion1] and/or [criterion2] as placeholders in these statements. The values in the criterion1 and criterion2 columns will replace these placeholders at execution time. You will now be able to copy the expression to similar tests with different values for the criteria. Example expression:

    SELECT CAST(SUM(f.sales) AS int)
    FROM factSales f
    INNER JOIN dimDate d ON f.DateKey = d.DateKey
    INNER JOIN dimChannel ch ON f.ChannelKey = ch.ChannelKey
    WHERE d.YearMonth = [criterion1]
    AND ch.ChannelName = '[criterion2]'
    /* SQL code of example test expression */
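    To make the placeholder mechanics concrete, here is a minimal Python sketch. The `build_test_query` helper is hypothetical; the framework itself does this substitution with REPLACE in the SSIS package shown later.

    ```python
    def build_test_query(expression, criterion1, criterion2):
        # Replace the placeholders with the criterion values, as the
        # framework does at execution time.
        return (expression
                .replace("[criterion1]", criterion1)
                .replace("[criterion2]", criterion2))

    expr = ("SELECT CAST(SUM(f.sales) AS int) FROM factSales f "
            "WHERE d.YearMonth = [criterion1] AND ch.ChannelName = '[criterion2]'")
    query = build_test_query(expr, "201404", "Internet")
    print(query)
    ```

    With the criteria from the example row this produces an executable statement filtering on 201404 and 'Internet'.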

    Testing Framework: Tables

    Import your Excel spreadsheet into a SQL Server table with the following definition:

    CREATE TABLE DataTests (
        testName nvarchar(64)
      , testType nvarchar(20)
      , criterion1 nvarchar(128)
      , criterion2 nvarchar(128)
      , expectedValue int
      , expression nvarchar(1024)
    )
    /* SQL code to create the DataTests table */

    Create an additional table to store the result of the tests that were performed. Use the following definition:

    CREATE TABLE DataTestResults (
        testName nvarchar(64)
      , testDate datetime DEFAULT GETDATE()
      , actualValue int
      , expectedValue int
    )
    /* SQL code to create the DataTestResults table */

    Testing Framework: SSIS Package

    Now create an SSIS Package that will perform the tests. It uses:

    • An Execute SQL Task to get the list of tests
    • A For Each Loop Container that loops through the list, with inside it:
      • An Execute SQL Task to execute every test
      • An Execute SQL Task to log the result of every test



    This package can be described with the following BIML (Business Intelligence Markup Language) snippet:

    <Packages>
      <Package Name="PKG TestData" ConstraintMode="Linear">
        <Variables>
          <Variable Name="TestList" DataType="Object" />
          <Variable Name="testQuery" DataType="String" />
          <Variable Name="testName" DataType="String" />
          <Variable Name="expectedValue" DataType="Int32">0</Variable>
          <Variable Name="actualValue" DataType="Int32">0</Variable>
        </Variables>
        <Tasks>
          <!-- Get list with tests to be performed -->
          <ExecuteSQL Name="SQL GetTestList" ConnectionName="META" ResultSet="Full">
            <DirectInput>
              SELECT testName
              , testQuery = REPLACE(REPLACE(expression,'[criterion1]',criterion1),'[criterion2]',criterion2)
              , expectedValue
              FROM DataTests WHERE testType='DWH'
            </DirectInput>
            <Results>
              <Result Name="0" VariableName="User.TestList" />
            </Results>
          </ExecuteSQL>
          <!-- Loop through tests and perform them -->
          <ForEachAdoLoop Name="FELC Test" SourceVariableName="User.TestList" ConstraintMode="Linear">
            <VariableMappings>
              <VariableMapping Name="0" VariableName="User.testName" />
              <VariableMapping Name="1" VariableName="User.testQuery" />
              <VariableMapping Name="2" VariableName="User.expectedValue" />
            </VariableMappings>
            <Tasks>
              <!-- Perform test -->
              <ExecuteSQL Name="SQL Execute Test" ConnectionName="DWH" ResultSet="SingleRow">
                <VariableInput VariableName="User.testQuery" />
                <Results>
                  <Result Name="0" VariableName="User.actualValue" />
                </Results>
              </ExecuteSQL>
              <!-- Log test result -->
              <ExecuteSQL Name="SQL Log Test Result" ConnectionName="META" ResultSet="None">
                <DirectInput>
                  INSERT INTO DataTestResults (testName,actualValue,expectedValue)
                  VALUES (?,?,?)
                </DirectInput>
                <Parameters>
                  <Parameter Name="0" Direction="Input" DataType="String" VariableName="User.testName" />
                  <Parameter Name="1" Direction="Input" DataType="Int32" VariableName="User.actualValue" />
                  <Parameter Name="2" Direction="Input" DataType="Int32" VariableName="User.expectedValue" />
                </Parameters>
              </ExecuteSQL>
            </Tasks>
          </ForEachAdoLoop>
        </Tasks>
      </Package>
    </Packages>
    /* BIML code to create the testing SSIS package */
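    The loop logic of this test package can also be sketched in plain Python. The `execute_scalar` callable is a hypothetical stand-in for the Execute SQL Task, and the single fake query/result pair is made up for the example.

    ```python
    def run_tests(tests, execute_scalar):
        # tests: list of (testName, testQuery, expectedValue) rows, like
        # the result set produced by "SQL GetTestList".
        results = []
        for test_name, test_query, expected in tests:
            actual = execute_scalar(test_query)  # "SQL Execute Test"
            results.append({"testName": test_name,
                            "actualValue": actual,
                            "expectedValue": expected})  # "SQL Log Test Result"
        return results

    fake_dwh = {"SELECT 125035": 125035}  # stand-in for the DWH connection
    log = run_tests([("InternetSales201401", "SELECT 125035", 125035)],
                    fake_dwh.get)
    print(log[0]["actualValue"] == log[0]["expectedValue"])  # True
    ```

    A test passes when the logged actualValue equals the expectedValue from the DataTests table.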


    In this blog post I discussed the components of an agile system that are necessary when you build a data warehouse where the source systems are still in development:

    • An Alert System to inform you of changes in the source systems.
    • A Data Warehouse Automation Framework.
    • A Testing Framework.

    I ended with a simple implementation of a testing framework that worked for me. Does this help you, or did you implement something similar? … I'd love to hear from you!

  • Sun
    Dec 13

    Create a LightSwitch Application to Maintain Metadata

    Correct metadata is essential for every automation solution. When I started working with BIML I used one table that describes the source to target table mapping. (see the “Creating a Meta Data Driven SSIS Solution with BIML” series)

    And that works fine in most simple scenarios like importing staging tables or when you are able to move business rules to a view. Key in these simple scenarios is that the columns in your target table have the same name and the same data type.

    Extending the metadata model

    In more recent assignments I added a column mapping table and used T-SQL to simplify the BIML script. With this approach I created more robust SSIS packages that provided extra possibilities, like:

    • Incremental load patterns
    • Converting data types
    • Surrogate key lookups
    • Expressions to calculate new values

    A problem with this approach however is maintaining these column mappings. So I decided to take Visual Studio LightSwitch for a spin and create a simple application to fulfill my maintenance needs.


    Visual Studio LightSwitch is a rapid application development (RAD) tool that is used to help write data-centric line-of-business (LOB) applications. An important prerequisite of LightSwitch is that you need to have a data model, or construct one using the LightSwitch GUI. Based on this data model you can easily define a set of screens and your application is done. So: no coding necessary (although coding is possible to extend the standard functionality).

    Data Model

    For my data model I decided to add two additional tables:

    • ETLJob
    • Connection

    From a BIML perspective the ETLJob table is primarily used to create the master package with Execute Package and Execute SQL Tasks. An ETLJob can be either:

    • A generated package (with one or more table mappings)
    • A manual package, or
    • A SQL statement

    So I ended up with the following model:


    The SQL DDL Statements to create this model:


    CREATE TABLE meta.Connection
        ( ConnectionID INT IDENTITY(1,1) PRIMARY KEY
        , ConnectionName VARCHAR(64) NOT NULL
        , ConnectionType VARCHAR(20) NOT NULL
        , ConnectionString VARCHAR(256) NOT NULL
        , CreateInProject BIT NOT NULL DEFAULT(0)
        , DelayValidation BIT NOT NULL DEFAULT(1)
        )

    CREATE TABLE meta.ETLJob
        ( ETLJobID INT IDENTITY(1,1) PRIMARY KEY
        , ETLJobName VARCHAR(64) NOT NULL
        , ETLJobType VARCHAR(20) NOT NULL
        , ETLJobGroup VARCHAR(64)
        , SQLConnection INT
            REFERENCES meta.Connection(ConnectionID)
        , SQLStatement VARCHAR(512)
        )

    CREATE TABLE meta.TableMapping
        ( TableMappingID INT IDENTITY(1,1) PRIMARY KEY
        , ETLJob INT NOT NULL
            REFERENCES meta.ETLJob(ETLJobID)
        , TableName VARCHAR(64)
        , TableType VARCHAR(20)
        , TableConnection INT NOT NULL
            REFERENCES meta.Connection(ConnectionID)
        , TableSchema VARCHAR(64)
        , SourceConnection INT
            REFERENCES meta.Connection(ConnectionID)
        , SourceObject VARCHAR(256)
        )

    CREATE TABLE meta.ColumnMapping
        ( ColumnMappingID INT IDENTITY(1,1) PRIMARY KEY
        , TableMapping INT NOT NULL
            REFERENCES meta.TableMapping(TableMappingID)
        , ColumnName VARCHAR(64) NOT NULL
        , ColumnType VARCHAR(20) NOT NULL
        , ColumnDataType VARCHAR(20) NOT NULL
        , SourceColumnName VARCHAR(64)
        , SourceColumnDataType VARCHAR(20)
        , Calculation VARCHAR(256)
        , LookupConnection INT
            REFERENCES meta.Connection(ConnectionID)
        , LookupObject VARCHAR(256)
        , InputColumnName VARCHAR(64)
        , InputColumnDataType VARCHAR(20)
        , OutputColumnName VARCHAR(64)
        )

    Importing the data model in LightSwitch

    To import this data model you only need 6 steps:

    1. Create a new database (I named it MetaBase) and execute the aforementioned DDL statements to create the objects and relations.
    2. In Visual Studio create a New Project. Select LightSwitch as template and choose LightSwitch Application (Visual C#). Give the project a name (I used Metabase) and click OK.

    3. In the next screen click Attach to External Data Source.

    4. In the Attach Data Source Wizard form select Database and click Next.
    5. In the Connection Properties window enter the server name\instance and Select the database you created in step 1. Click OK.

    6. Check Tables and use the default name in Specify the name of the data source in the Choose your Database Objects step. Click Finish.


    You have imported the data model into LightSwitch.


    Changing the data model

    When you look at your data model in the LightSwitch designer you will notice LightSwitch made some small name changes describing the relations. Do not try to correct these. Instead, change the display name in the properties window, as I did for the source connection in this screenshot.



    Creating the screens

    Creating screens is even easier than creating the data model.

    1. Right-click Screens in Solution Explorer and choose Add Screen…


    2. In the Add New Screen dialog select Details Screen as template. Select MetaBaseData.ETLJob as Screen Data and check ETLJob TableMapping. Click OK.


    3. After I created the screen I made some small changes in the designer: Moved some fields up or down by dragging and dropping and changed the number of Lines in the SQL Statement field from 1 to 3.


    I repeated steps 1 and 2 to create additional screens:

    Template Screen Data
    Search Data Screen Connections
    Search Data Screen ETLJobs
    Search Data Screen TableMappings
    New Data Screen Connections
    New Data Screen ETLJobs 2)
    New Data Screen TableMappings 1)
    Details Screen Connections
    Details Screen TableMappings 1)

    1) Add ColumnMappings under Additional data to include
    2) Add TableMappings under Additional data to include


    Navigation, Shell and Theme

    Right-click Screens and select Edit Screen Navigation from the context menu. Change the screen navigation to match the following image:


    Choose General Properties and change the Shell to LightSwitch Standard Shell and the Theme to LightSwitch Blue Theme.




    With Visual Studio LightSwitch you can easily create an application to maintain the metadata for your BIML solution: you create a normalized data model, import it into LightSwitch and start adding screens. And if you like BIML you’ll probably like LightSwitch as well: both make it easier to program your solution.

    An additional advantage of storing the metadata in separate tables and creating an application to maintain that data is that you get a great overview of your data lineage and up-to-date documentation. Here are some screenshots of the application with data.

    Search ETLJob screen showing the packages and SQL Statements of our solution.

    An ETL Job detail screen with the associated table mappings.

    A Table Mapping detail screen with the associated column mappings.

  • Tue
    Sep 13

    Making your Biml files less complex

    The combination of XML and C# code in a Biml file can make Biml files very complex. In web development Microsoft overcame this problem by introducing the code-behind model, with one file for the static (HTML) text and a separate file for the dynamic code. Unfortunately this is not possible with Biml. There are however some ways to make your files less complex:

    1. Separate files
    2. Move code logic to T-SQL

    In this post I’ll briefly describe the first option and use some examples to explain the second option.

    Separate files

    You can use the include directive to insert a piece of static XML from another file into your Biml file. John Minkjan has a nice example on his website.
    If you need more dynamic content you can also opt for the CallBimlScript function, which allows you to use properties. Of course you will have to handle these properties in the callee file.

    Move some code logic to T-SQL

    A typical Biml solution consists not only of Biml files but also of metadata that describes the packages you want to build. I typically use a table mapping table and a column mapping table in a SQL Server database. This allows me to create a stored procedure that combines information from both tables into one dataset with all relevant information in one row.

    For the next examples I will use the following information in this source to target mapping table:


    Using FOR XML PATH('') in T-SQL

    With the FOR XML PATH('') syntax you can transform columns into a string. The T-SQL statement:

    SELECT srcColumnList =
        (SELECT ', ' + srcColumn
         FROM meta.ColumnMapping
         WHERE srcColumn IS NOT NULL
         FOR XML PATH(''))

    returns , Id, f_name, l_name, sex: a string you can almost use as a column list in the SELECT <column list> FROM <source table> statement of a source component. Almost … because you'll have to remove the first comma and handle strange characters in column names. So you'll have to do some STUFF to remove this comma:

    SELECT srcColumnList = STUFF(
        (SELECT ', ' + QUOTENAME(srcColumn)
         FROM meta.ColumnMapping
         WHERE srcColumn IS NOT NULL
         FOR XML PATH('')),1,2,'')

    which returns the string we need: [Id], [f_name], [l_name], [sex]
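    For readers more at home in a procedural language, the same trick can be mimicked in Python. This is only a sketch: `quotename` mirrors T-SQL's QUOTENAME, and the slice mirrors STUFF(..., 1, 2, '').

    ```python
    def quotename(name):
        # Mirror T-SQL QUOTENAME: wrap in brackets, double any closing bracket.
        return "[" + name.replace("]", "]]") + "]"

    columns = ["Id", "f_name", "l_name", "sex"]  # the srcColumn values
    concatenated = "".join(", " + quotename(c) for c in columns)
    src_column_list = concatenated[2:]  # drop leading ", ", like STUFF(...,1,2,'')
    print(src_column_list)  # [Id], [f_name], [l_name], [sex]
    ```

    The concatenate-then-strip shape is exactly what the FOR XML PATH('') + STUFF combination does server-side.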

    Creating the data conversion task

    Expanding on the previous statement you can create additional dynamic content for your Biml solution, in this case the column definitions in a Data Conversion task. In Biml you would write:

    <DataConversion Name="DC">
    <Column SourceColumn="f_name" TargetColumn="FirstName"
                DataType="AnsiString" Length="40" CodePage="1252" />
         <…more columns …>


    To create the column list for this data conversion use the statement:

    SELECT DCColumns = REPLACE(REPLACE(STUFF(
        (SELECT char(10) + '<Column SourceColumn="' + srcColumn
            + '" TargetColumn="' + tgtColumn
            + '" DataType="' + tgtDataType
            + CASE WHEN tgtDataType='AnsiString'
              THEN '" Length="' + CAST(tgtLength AS varchar(10))
                 + '" CodePage="1252" />'
              ELSE '" />' END
         FROM meta.ColumnMapping
         FOR XML PATH('')),1,1,''),'&lt;','<'),'&gt;','>')

    Which returns the Biml string we need:

    <Column SourceColumn="Id" TargetColumn="CustomerID"
         DataType="Int32" />
    <Column SourceColumn="f_name" TargetColumn="FirstName"
         DataType="AnsiString" Length="40" CodePage="1252" />
    <Column SourceColumn="l_name" TargetColumn="LastName"
         DataType="AnsiString" Length="40" CodePage="1252" />
    <Column SourceColumn="sex" TargetColumn="Gender"
        DataType="AnsiString" Length="6" CodePage="1252" />

    Some remarks about the SQL statement:

    • We don’t really need the char(10), but the line break is convenient when we look at the results.
    • Since this query uses the FOR XML syntax and XML can’t handle the opening and closing tag signs, SQL Server replaces them with their escape codes. We have to use the REPLACE function to change these escape codes back into the opening and closing tag signs.
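    The same row-to-string idea can be sketched in Python with made-up mapping rows; Length and CodePage only apply to AnsiString targets, mirroring the CASE WHEN branch above.

    ```python
    mappings = [  # hypothetical (srcColumn, tgtColumn, tgtDataType, tgtLength) rows
        ("Id", "CustomerID", "Int32", None),
        ("f_name", "FirstName", "AnsiString", 40),
    ]
    lines = []
    for src, tgt, dtype, length in mappings:
        attrs = f'SourceColumn="{src}" TargetColumn="{tgt}" DataType="{dtype}"'
        if dtype == "AnsiString":  # the CASE WHEN branch
            attrs += f' Length="{length}" CodePage="1252"'
        lines.append(f"<Column {attrs} />")
    print("\n".join(lines))
    ```

    Each mapping row becomes one `<Column … />` line, exactly what the stored procedure returns in its DCColumns column.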


    Creating the stored procedure

    For this example I created the following stored procedure that combines the two metadata tables:

    CREATE PROCEDURE meta.getPackageDetails (@MappingType varchar(50)) AS
    SELECT
          PackageName = t.MappingName
        , TargetTable = t.tgtSchema + '.' + t.tgtTable
        , TargetConnection = t.tgtConnection
        , SourceTable = t.srcSchema + '.' + t.srcTable
        , SourceConnection = t.srcConnection
        , srcColumnList = STUFF((SELECT  ', ' + QUOTENAME(srcColumn)
            FROM meta.ColumnMapping
            WHERE srcColumn is not null
            AND TableMappingID=t.TableMappingID
            FOR XML PATH('')),1,2,'')
        , DCColumns = REPLACE(REPLACE(STUFF(
            (SELECT  char(10) + '<Column SourceColumn="' + srcColumn
            + '" TargetColumn="' + tgtColumn
            + '" DataType="' + tgtDataType
            + CASE WHEN tgtDataType='AnsiString'
            THEN '" Length="' + CAST(tgtLength AS varchar(10))
                + '" CodePage="1252" />'
            ELSE '" />' END
            FROM meta.ColumnMapping
            WHERE TableMappingID=t.TableMappingID
            FOR XML PATH('')),1,1,''),'&lt;','<'),'&gt;','>')
    FROM meta.TableMapping t
    WHERE t.MappingType = @MappingType

    Less complex Biml file

    The Biml file that you need to create the packages:

    <Biml xmlns="http://schemas.varigence.com/biml.xsd">
      <#@ include file="Connection.biml" #>
      <Packages>
        <# string sConn = "Provider=SQLNCLI10;Server=.\\SQL2012;Initial Catalog=BimlSamples;Integrated Security=SSPI;"; #>
        <# string sSQL = string.Format("Exec meta.getPackageDetails '{0}'", "dim"); #>
        <# DataTable tblPackages = ExternalDataAccess.GetDataTable(sConn, sSQL); #>
        <# foreach (DataRow pkg in tblPackages.Rows){ #>
        <Package Name="<#=pkg["PackageName"]#>" ConstraintMode="Linear">
          <Tasks>
            <Dataflow Name="DFT <#=pkg["PackageName"]#>">
              <Transformations>
                <OleDbSource Name="ODS Source" ConnectionName="<#=pkg["SourceConnection"]#>">
                  <DirectInput>
                    SELECT <#=pkg["srcColumnList"]#>
                    FROM <#=pkg["SourceTable"]#>
                  </DirectInput>
                </OleDbSource>
                <DataConversion Name="DC">
                  <Columns>
                    <#=pkg["DCColumns"]#>
                  </Columns>
                </DataConversion>
                <OleDbDestination Name="ODD Target" ConnectionName="<#=pkg["TargetConnection"]#>">
                  <ExternalTableOutput Table="<#=pkg["TargetTable"]#>" />
                </OleDbDestination>
              </Transformations>
            </Dataflow>
          </Tasks>
        </Package>
        <# } #>
      </Packages>
    </Biml>
    <#@ import namespace="System.Data" #>


    In this blog post I discussed some ways to make your Biml files less complex. You can move (semi-)static content to other files and import it with the include directive or the CallBimlScript function. And you can move parts of complicated Biml code to T-SQL. How much you move to T-SQL will largely depend on your proficiency in C# or T-SQL.

  • Sat
    Aug 13

    8 Practical BIML Tips

    You can leverage Business Intelligence Markup Language (BIML) to automate the creation of Microsoft SQL Server Integration Services (SSIS) packages. The popular open source BIDSHelper project includes Biml functionality, enabling anyone to write and execute BIML code for free. Among professional SSIS developers BIML is quickly gaining popularity.
    In this post I’ll share some practical tips …

    1. Better copy and paste experience

    The irritating behavior of the XML editor in Visual Studio when copying BIML script, and how to overcome it, has been documented on several blogs. In this overview of tips I couldn’t leave it out.
    In Visual Studio / SSDT select Options in the Tools menu.
    In the tree view of the Options form expand Text Editor, expand XML and choose Formatting.
    Next uncheck both options under Auto Reformat.


    2. Put directive tags <#@ .. #> at the bottom of your BIML file

    In many examples (including mine) these directives are placed at the top of the BIML file, which makes sense because this is the default location in software development. However, when Visual Studio opens a file with a directive before the <Biml> XML tag it doesn’t use the XML editor, and we lose the formatting and IntelliSense features in the GUI.
    So instead place the directives at the bottom of the file, after the closing </Biml> tag. This has no effect on the creation of packages.

    3. Vanilla Biml file

    Before using script in a BIML file, create a working BIML file that can create a package with more than 80% of the expected functionality.
    Copy this file, use it as a base and then start scripting.
    Why? The combination of XML and code in one document makes it more complicated to select the correct BIML elements and attributes. Also use a small dataset, so that when you test your work only a small number of packages is created.

    4. Test and save often

    During development, regularly check your work. Use the options “Check Biml for Errors” or “Generate SSIS Packages” from the context menu. This way you not only test your work but save it as well.
    Why? Debugging BIML files is mostly a pain. Error messages are limited and often refer to the wrong row, and small typos can have a huge impact. So you’d better find your errors early in development.

    5. Special XML characters

    XML has some special characters that you need to enclose in a CDATA tag or replace with their escape codes:

    • double quote " ( &quot; )
    • single quote '  ( &apos; )
    • less than sign  < ( &lt; )
    • greater than sign  >  ( &gt; )
    • ampersand & ( &amp; )

    As an example suppose you have the following BIML:

    <DirectInput>
    SELECT Name FROM dbo.People WHERE Age > 25
    </DirectInput>

    then the XML processor will fail at Age > 25. As a remedy, change your BIML into:

    <DirectInput>
    <![CDATA[SELECT Name FROM dbo.People WHERE Age > 25]]>
    </DirectInput>

    or:

    <DirectInput>
    SELECT Name FROM dbo.People WHERE Age &gt; 25
    </DirectInput>
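    You can check this escaping behaviour with Python's standard library: `xml.sax.saxutils.escape` rewrites <, > and & into their escape codes (quotes only when explicitly requested).

    ```python
    from xml.sax.saxutils import escape

    sql = "SELECT Name FROM dbo.People WHERE Age > 25"
    # escape() replaces the XML special characters &, < and >.
    print(escape(sql))  # SELECT Name FROM dbo.People WHERE Age &gt; 25
    ```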

    6. Special C# characters

    C# also has some special characters that you will need to escape with a backslash. Most notably:

    • the backslash itself \ ( \\ )
    • single quote ' ( \' )
    • double quote " ( \" )

    As an example, escaping the backslashes in a file location:
    string FileName = "C:\\Dir\\File.txt";
    or use the verbatim string construction:
    string FileName = @"C:\Dir\File.txt";
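    Python, for comparison, offers the same pair of options: escaped backslashes, or a raw string that plays the role of C#'s verbatim @"..." string.

    ```python
    escaped = "C:\\Dir\\File.txt"  # backslashes escaped one by one
    raw = r"C:\Dir\File.txt"       # raw string: backslashes taken literally
    print(escaped == raw)  # True
    ```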

    7. Learn some basic C#

    C# is the principal language of the .NET framework and is widely used for all sorts of programs: Web Services, Web Applications, Windows Form applications, SSIS Scripts, SQL CLR Stored Procedures etc. An investment in learning some C# will pay off. There is an abundant supply of websites and books with relevant information.

    To get you started: read the chapters Basics and Flow Control of this tutorial:
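    To give a flavour: the constructs you will use most in BimlScript code nuggets are plain variables, string concatenation and foreach loops. A minimal sketch (the table names are made up for illustration):

    ```csharp
    // Loop over a list of table names and derive a package name for each:
    // the same pattern a BimlScript nugget uses to generate one package per table.
    string[] tables = { "Customers", "Orders", "Products" };
    foreach (string table in tables)
    {
        string packageName = "IMP_" + table;   // e.g. "IMP_Customers"
        Console.WriteLine(packageName);
    }
    ```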

    8. Learn from the samples

    Steal / use the samples on:

  • Sat
    May 13

    Logging in SSIS with BIML 2 – Logging Package Execution in the Control Flow

    This is the second post in a series on logging in SSIS. In the first post I described logging packages and tasks within a package by using an event handler. In this post I’ll describe a second method: logging package execution in the control flow. It’s a very common pattern:

    • At the start of the package create a record in an audit table.
    • Use the identity of this row in the dataflow and add it to every row you insert in the target table.
    • Use several counters to count the rows passing through the dataflow.
    • At the end of the package update the audit row that we created in the first step with the end time and the row counters.

    Data Lineage

    The most important advantage of this method lies in step 2: by adding a reference to the audit row to every row that we insert into the target table, we can trace every row back to when and how it was inserted. This has saved my ass several times, by being able to roll back a faulty import from two weeks earlier and repopulate the table with the correct data.

    Andy Leonard wrote a nice blog post on how you can set this up in SSIS. In this post I’ll focus on creating a similar package with BIML. My package will look like this:


    The Audit Table

    To store the package execution information we need an audit table. Create it with the following statement:

    CREATE TABLE meta.SsisAudit(
        AuditID int IDENTITY(1,1) NOT NULL
      , PackageName  varchar(50)
      , PackageGUID uniqueidentifier
      , PackageVersionGUID uniqueidentifier
      , ExecutionID uniqueidentifier
      , StartDate datetime
      , EndDate datetime
      , PackageDuration AS DATEDIFF(second, StartDate
            , EndDate) PERSISTED
      , ExtractRowCount int
      , InsertRowCount int
      , UpdateRowCount int
      , IsProcessed bit DEFAULT(0)
      , CONSTRAINT PK_SsisAudit PRIMARY KEY CLUSTERED
        ( AuditID ASC));
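    With an AuditID column in place on the target tables (step 2 of the pattern), every row can be traced back to the load that inserted it. A sketch of such a lineage query, assuming a hypothetical target table dbo.FactSales that carries the AuditID column:

    ```sql
    -- For each load, show which package ran, when, and how many rows it left behind.
    SELECT a.AuditID
         , a.PackageName
         , a.StartDate
         , COUNT(*) AS RowsLoaded
    FROM dbo.FactSales f
    JOIN meta.SsisAudit a
        ON a.AuditID = f.AuditID
    GROUP BY a.AuditID, a.PackageName, a.StartDate;
    ```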

    The Stored Procedures

    To log the package execution information we’ll use two stored procedures that write the information to the SsisAudit table. The first one is used at the start of the package and returns the id of the inserted row. The second is used at the end of the package to update this row with the end date and the row counters.

    CREATE PROCEDURE [meta].[uspNewAuditRow]
        @PackageName varchar(50)
      , @PackageGUID uniqueidentifier
      , @PackageVersionGUID uniqueidentifier
      , @ExecutionID uniqueidentifier
      , @StartDate datetime
    AS
    INSERT INTO meta.SsisAudit
      ( PackageName
      , PackageGUID
      , PackageVersionGUID
      , ExecutionID
      , StartDate
      , IsProcessed)
    VALUES
      ( @PackageName
      , @PackageGUID
      , @PackageVersionGUID
      , @ExecutionID
      , @StartDate
      , 0);

    /* Return the id of the inserted row */
    SELECT CAST(SCOPE_IDENTITY() AS int) AS AuditID;

    CREATE PROCEDURE [meta].[uspUpdateAuditRow]
        @ExtractRowCount int
      , @InsertRowCount int
      , @UpdateRowCount int
      , @AuditID int
    AS
    UPDATE meta.SsisAudit
    SET EndDate = SYSDATETIME()
      , ExtractRowCount = @ExtractRowCount
      , InsertRowCount = @InsertRowCount
      , UpdateRowCount = @UpdateRowCount
      , IsProcessed = 1
    WHERE AuditID = @AuditID;
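    You can verify both procedures from a query window before wiring them into a package; all values below are illustrative:

    ```sql
    -- Create an audit row; the procedure returns the new AuditID.
    EXEC meta.uspNewAuditRow
        @PackageName        = 'PKG Test'
      , @PackageGUID        = '00000000-0000-0000-0000-000000000001'
      , @PackageVersionGUID = '00000000-0000-0000-0000-000000000002'
      , @ExecutionID        = '00000000-0000-0000-0000-000000000003'
      , @StartDate          = '2013-05-01T08:00:00';

    -- Close the audit row (assuming the call above returned AuditID 1).
    EXEC meta.uspUpdateAuditRow
        @ExtractRowCount = 100
      , @InsertRowCount  = 90
      , @UpdateRowCount  = 10
      , @AuditID         = 1;
    ```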

    BIML Snippets

    To create the logging facility in an SSIS package, use the following code snippets in the BimlScript that creates the package. (The connection name MetaDatabase is an assumption: use the name of the connection that points to the database holding the SsisAudit table.)

       <Variables>
          <Variable Name="RcExtract"
              DataType="Int32" Namespace="User">0</Variable>
          <Variable Name="RcInsert"
              DataType="Int32" Namespace="User">0</Variable>
          <Variable Name="RcUpdate"
              DataType="Int32" Namespace="User">0</Variable>
          <Variable Name="AuditID"
              DataType="Int32" Namespace="User">0</Variable>
       </Variables>

       <!--Create audit row and get id-->
       <ExecuteSQL Name="SQL Create AuditRow and Get AuditID"
                   ConnectionName="MetaDatabase" ResultSet="SingleRow">
          <DirectInput>EXEC [meta].[uspNewAuditRow] ?,?,?,?,?</DirectInput>
          <Parameters>
             <Parameter Name="0" Direction="Input"
                VariableName="System.PackageName" />
             <Parameter Name="1" Direction="Input"
                VariableName="System.PackageID" />
             <Parameter Name="2" Direction="Input"
                VariableName="System.VersionGUID" />
             <Parameter Name="3" Direction="Input"
                VariableName="System.ExecutionInstanceGUID" />
             <Parameter Name="4" Direction="Input"
                VariableName="System.StartTime" />
          </Parameters>
          <Results>
             <Result Name="0"
                VariableName="User.AuditID" />
          </Results>
       </ExecuteSQL>

       <!--In the dataflow: add meta data-->
       <DerivedColumns Name="DC add meta data">
          <Columns>
             <Column Name="AuditID" DataType="Int32">@[User::AuditID]</Column>
          </Columns>
       </DerivedColumns>

       <!--Update audit row (parameters in the order of the procedure signature)-->
       <ExecuteSQL Name="SQL Update Audit Row"
                   ConnectionName="MetaDatabase" ResultSet="None">
          <DirectInput>EXEC [meta].[uspUpdateAuditRow] ?,?,?,?</DirectInput>
          <Parameters>
             <Parameter Name="0" Direction="Input"
                VariableName="User.RcExtract" />
             <Parameter Name="1" Direction="Input"
                VariableName="User.RcInsert" />
             <Parameter Name="2" Direction="Input"
                VariableName="User.RcUpdate" />
             <Parameter Name="3" Direction="Input"
                VariableName="User.AuditID" />
          </Parameters>
       </ExecuteSQL>

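    The three counter variables are typically filled inside the dataflow with RowCount transformations, one per counter. A sketch (where exactly they sit in the dataflow depends on your package):

    ```xml
    <!--Directly after the source: count extracted rows-->
    <RowCount Name="CNT Extract" VariableName="User.RcExtract" />
    <!--Before the insert destination: count inserted rows-->
    <RowCount Name="CNT Insert" VariableName="User.RcInsert" />
    <!--Before the update branch: count updated rows-->
    <RowCount Name="CNT Update" VariableName="User.RcUpdate" />
    ```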


    You can leverage Business Intelligence Markup Language – or BIML – to automate the creation of Microsoft SQL Server Integration Services (SSIS) Packages. BIML is a creation of Varigence, Inc. and is available in proprietary products, open source projects, and has been published as an open language specification. The popular open source BIDSHelper project includes Biml functionality, enabling anyone to write and execute Biml code for free.

    Do you need a head start with the automation of SSIS packages? Consider my BIML Workshop.

  • Sun
    Dec 12

    BIML Script To Disable Indexes And Later Rebuild Indexes

    In an earlier post I described the necessity of dropping non-clustered indexes before loading data into your data warehouse and rebuilding them afterwards. Later I found a better approach: instead of dropping the indexes you can just disable them with:
    ALTER INDEX <IndexName> ON <TableName> DISABLE
    The loading process will now perform faster because SQL Server won’t update these indexes while you are loading. After loading has completed you can rebuild them with:
    ALTER INDEX <IndexName> ON <TableName> REBUILD
    The Biml code at the bottom of this post is part of a larger script in which I loop through a dataset that describes a series of packages. In the relevant packages this script is added and executed. In the image of the package you can see what happens:
    • With the Execute SQL Task ‘SQL GetIndexList’ I get a list of statements I need to disable and rebuild the non-clustered indexes of the target table. The result (Full result set) of this SQL statement is stored in an object variable with the name IndexList.
    • In the For Each Loop Container ‘FELC Disable Indexes’ I loop through the rows of this IndexList and use the Execute SQL Task ‘SQL Disable Index’ for each row, thus disabling the relevant non-clustered indexes.
    • Next is the DataFlow Task, which I have kept empty for this purpose.
    • I end with a For Each Loop Container ‘FELC Rebuild Indexes’ in which I rebuild every index I disabled earlier.
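    For a (hypothetical) non-clustered index on a dimension table, the two statements generated by this pattern look like this:

    ```sql
    -- Disable the index before the load: SQL Server stops maintaining it.
    ALTER INDEX IX_DimCustomer_Name ON dbo.DimCustomer DISABLE;

    -- ... load the table ...

    -- Rebuild afterwards: this also re-enables the index.
    ALTER INDEX IX_DimCustomer_Name ON dbo.DimCustomer REBUILD;
    ```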



    In most data warehouse scenarios this is a viable pattern. However, in a scenario where you load relatively few rows into a large table with many non-clustered indexes, rebuilding these indexes can take very long.

    The Biml Script to create this package:

      <Biml xmlns="">
        <# string TableName="[Production].[ProductCategory]";#>
          <OleDbConnection Name="TargetDatabase"
                      ConnectionString="Data Source=.\SQL2012; Initial Catalog=AdventureWorks2012; Provider=SQLNCLI10.1; Integrated Security=SSPI;"
          <Package Name="PKG Handle Indexes Gracefully" ConstraintMode="Linear">
              <Variable Name="IndexList" DataType="Object" />
              <Variable Name="DisableStatement" DataType="String" />
              <Variable Name="RebuildStatement" DataType="String" />
              <ExecuteSQL Name="SQL GetIndexList" ConnectionName="TargetDatabase" ResultSet="Full">
       DisableStatement = 'ALTER INDEX ' + QUOTENAME(i.Name) + ' ON '
          + QUOTENAME( + ' DISABLE'
       ,RebuildStatement = 'ALTER INDEX ' + QUOTENAME(i.Name) + ' ON '
          + QUOTENAME( + ' REBUILD'
    FROM sys.indexes i
    JOIN sys.objects o
        ON o.object_id=i.object_id
    WHERE i.is_primary_key = 0       
    AND i.is_unique_constraint = 0  
    AND i.is_disabled = 0    
    AND i.type_desc = N'NONCLUSTERED'       
    AND o.TYPE = N'U'  
        + QUOTENAME('<#=TableName#>'
                  <Result Name="0" VariableName="User.IndexList" />
              <!--Loop through indexes and disable them-->
              <ForEachAdoLoop Name="FELC Disable Indexes"
                              ConstraintMode="Linear" >
                  <VariableMapping Name="0"
                                   VariableName="User.DisableStatement" />
                  <ExecuteSQL Name="SQL Disable Index"
                              ResultSet="None" >
                    <VariableInput VariableName="User.DisableStatement" />

              <Dataflow Name="DFT YourDataFlowTask" />
              <!--DataFlow Logic-->
              <!--Loop through indexes and rebuild them-->
              <ForEachAdoLoop Name="FELC Rebuild Indexes"
                              ConstraintMode="Linear" >
                  <VariableMapping Name="1"
                                   VariableName="User.RebuildStatement" />
                  <ExecuteSQL Name="SQL Rebuild Index"
                              ResultSet="None" >
                    <VariableInput VariableName="User.RebuildStatement" />



  • Sat
    Sep 12


  • Wed
    Oct 11

    Biml versus CodeSmith and MyGeneration

    After a recent conference a participant pointed me to CodeSmith and MyGeneration as alternatives to BimlScript. And although I don’t have any experience with these tools, from their websites I understand he is right: you could use these tools in a similar fashion.

    Both are generic code generators: given a document in XML or some other format, they are able to loop through a dataset and inject values into the original document. So you could take a certain SSIS package… investigate the XML… add placeholders… loop through a dataset… replace the placeholders in the SSIS package with the information from your dataset… and save the newly created documents.

    One important disadvantage of this approach is that you have to scrutinize the rather complex XML of an SSIS package. Biml offers an easier and cleaner way to describe a package, because it doesn’t need to hold the design aspects of the BIDS interface of the package.

    An example of Biml with BimlScript: