Viewing or Saving the Compiled Biml

Some time ago a BIML enthusiast asked me for the location of the temporary files that BIDSHelper create when you generate the SSIS packages in BIDSHelper. I couldn’t help him and explained that the Biml Engine first compiles the Biml documents,  applying any BimlScript in memory and then creates the SSIS packages. The intermediate step isn’t persisted on disk.

Debugging Biml Files Made Easier

Obviously he needed this intermediate result to better debug his Biml files and improve his development efforts. Recently I learned this simple trick to create the intermediate results he wanted. And I like to share it with you in this blog post:

  • Add a new Biml File to your project and name it SaveCompiledBiml.biml
  • Replace the contents of this file with the next code block
<Biml xmlns=”https://web.archive.org/web/20160825182845/http://schemas.varigence.com/biml.xsd”>
<#
string xBiml = RootNode.GetBiml();
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xBiml);
string sFile = @”C:\temp\myBimlFile.xml”;
xmlDoc.Save(sFile);
#>
</Biml>
<!–Directives:–>
<#@ template tier=”5″ #>
<#@ import namespace=”Varigence.Hadron.CoreLowerer.SchemaManagement” #>
<#@ import namespace=”System.Xml” #>
<!–BIML Code to save the compiled BIML to file: SaveCompiledBiml.biml–>

view rawSaveCompiledBiml hosted with ❤ by GitHub

Some remarks:

Change the directory and filename of the sFile variable to match your environment / wishes.
In this example I have used tier 5. If you have files with a higher tier change the template tier directive in the SaveCompiledBiml file.  
(BIML files are compiled in the order of their “tier”, the files with the lowest tier first and those with the highest last. The default tier is 0 for files without BimlScript and 1 for files with BimlScript.)

Do you want to improve the quality and speed of your ETL-development?

Biml is quickly gaining popularity as the way to automate the creation of SSIS packages. And thus saving time and money while improving quality. Are you interested in automation of SSIS packages with Biml? Consider my Biml Workshop to get a headstart.

Building a data warehouse while the source systems are still in development

Some years ago the client of a contractor I worked for made some major investments in their IT-landscape. They replaced their ERP – and CRM system and of course BI/Reporting was part of the acceptance criteria for the new system. It was a big project and a disaster! ERP, CRM and BI consultants from the contractor where running in and out. Discussing requirement details with the client and adapting the systems to these wishes. For the BI team it was especially hard. When we build something on Monday chances were slim that it still worked on Thursday. We depended upon the ERP and CRM team to communicate their changes and provide us with correct test data. And there was no love wasted between the teams. I was very glad when the opportunity arose to leave this war hole and move on. And I did.

Trending

Nevertheless it seems to become a trend to build/adapt the BI system before a new source system has been launched. And make it part of the acceptance criteria for the new system. This of course offers the client an easy way to verify the new system by comparing the reports they work with. In my previous three projects (part) of the source system was still in development and I would probably have gone crazy keeping up with all the changes if I hadn’t been able to automate the work.

Agility is the name of the game

image

In these situations I have found that you (as in an agility drill) need three things:

Testing Framework

A testing framework has a complicated ring to it. But if you keep it simple and focus on the results the users expect to see it will be easy and the advantages are great:

  • You’ll notice any errors early in the process.
  • You can correct the errors before the users’ starts testing.
  • You can pinpoint any changes made in the source system that ruin your reports.
  • You’ll gain confident in your solution and sleep better.

I’ll describe a method that has worked for me. Key to success herein is starting early!

First get hard copies and digital versions of reports the users are using now. Ask them to highlight the important measures. Start entering the tests in an Excel spreadsheet and use the digital versions of the reports to copy any relevant data) I use the following format where I keep values in the column testName unique

testNametestTypecriterion1criterion2expectedValueexpression
InternetSales201401DWH201404Internet125.035 

Now the difficult/laborious part: start entering SQL statement in the expression column that will return the expected value. Use [criterion1] and/or [criterion2] as placeholders in these statements. The values in the criterion1 and criterion2 column will replace these placeholders at execution time. You will now be able to copy the expression to similar test with different values for the criteria. Example expression:SELECT CAST(SUM(f.sales) AS int) FROM factSales f INNER JOIN dimDate d ON f.DateKey = d,DateKey INNER JOIN dimChannel ch ON f.ChannelKey = ch.ChannelKey WHERE d.YearMonth = [criterion1] AND ch.ChannelName = ‘[criterion2]’ /* SQL code of example test expression */

Testing Framework: Tables

Import your Excel spreadsheet into a SQL Server table with the following definition:CREATE TABLE DataTests ( testName nvarchar(64) , testType nvarchar(20) , criterion1 nvarchar(128) , criterion2 nvarchar(128) , expectedValue int , expression nvarchar(1024) ) /* SQL code to create the DataTests table */

Create an additional table to store the result of the tests that were performed. Use the following definition:create table DataTestResults ( testName nvarchar(64) , testDate datetime default getdate() , actualValue int , expectedValue int ) /* SQL code to create the DataTestResults table */

Testing Framework: SSIS Package

Now create an SSIS Package that will perform the tests. It uses:

  • An Execute SQL Task to get a list of tests
  • A For Each Loop Container that is used to loop through the list with in it:
  • An Execute SQL Task to execute every test
  • An Execute SQL Task to log the result of every test
image

This package can be described with the following BIML (Business Intelligence Markup Language) snippet:

<Packages>
<Package Name="PKG TestData" ConstraintMode="Linear">
<Variables>
<Variable Name="TestList" DataType="Object" />
<Variable Name="testQuery" DataType="String" />
<Variable Name="testName" DataType="String" />
<Variable Name="expectedValue" DataType="Int32"> 0 </Variable>
<Variable Name="actualValue" DataType="Int32"> 0 </Variable>
</Variables>
<Tasks>
<!--Get list with tests to be performed—>
<ExecuteSQL Name="SQL GetTestList" ConnectionName="META" ResultSet="Full">
<DirectInput>
SELECT testName
, testQuery = REPLACE(REPLACE(expression,'[criterion1]',criterion1),'[criterion2]',criterion2) , expectedValue
FROM DataTests
WHERE testType='DWH'
</DirectInput>
<Results>
<Result Name="0" VariableName="User.TestList" />
</Results>
</ExecuteSQL>
<!--Loop through tests and perform them—>
<ForEachAdoLoop Name="FELC Test" SourceVariableName="User.TestList" ConstraintMode="Linear" >
<VariableMappings>
<VariableMapping Name="0" VariableName="User.testName" />
<VariableMapping Name="1" VariableName="User.testQuery" />
<VariableMapping Name="2" VariableName="User.expectedValue" />
</VariableMappings>
<Tasks>
<!--Perform test—>
<ExecuteSQL Name="SQL Execute Test" ConnectionName="DWH" ResultSet="SingleRow" >
<VariableInput VariableName="User.testQuery" />
<Results>
<Result Name="0" VariableName="User.actualValue" />
</Results>
</ExecuteSQL>
<!--Log test result—>
<ExecuteSQL Name="SQL Log Test Result" ConnectionName="META" ResultSet="None">
<DirectInput>
INSERT INTO DataTestResults (testName,actualValue,expectedValue) VALUES (?,?,?)
</DirectInput>
<Parameters>
<Parameter Name="0" Direction="Input" DataType="String" VariableName="User.testName" />
<Parameter Name="1" Direction="Input" DataType="Int32" VariableName="User.actualValue" />
<Parameter Name="2" Direction="Input" DataType="Int32" VariableName="User.expectedValue" />
</Parameters>
</ExecuteSQL>
</Tasks>
</ForEachAdoLoop>
</Tasks>
</Package>
</Packages>
*/ BIML code to create the testing ssis package */

Conclusion

In this blog post I discussed the components of an agile system that are necessary when you build a data warehouse where the source systems are still in development:

  • An Alert System to inform you of changes in the source systems.
  • A Data Warehouse Automation Framework.
  • A Testing Framework.

I ended with a simple implementation of a testing framework that worked for me. Does this help you or did you implement something similar? … I love to hear from you!

Making your Biml files less complex

Divide and Conquer

The combination of XML and C# code in a Biml file can make Biml files very complex. In web development Microsoft overcame this problem by introducing the code-behind model with one file for the static (html) text and a separate file for the dynamic code. Unfortunately this is not possible with Biml. There are however some ways to make your files less complex:

  1. Separate files
  2. Move code logic to T-SQL

In this post I’ll briefly describe the first option and use some examples to explain the second option.


Separate files

You can use the include directive to insert a piece of static xml of another file in your Biml file. John Minkjan has a nice example on his website.
If you need more dynamic content you can also opt for the CallBimlScript function which will allow you to use properties. Of course you will have to handle these properties in the callee file.


Move some code logic to T-SQL

A typical Biml solution exists not only of Biml files but also of metadata that describe the packages that you want to build. I typically use a table mapping table and a column mapping table in a SQL Server database. This allows me to create a stored procedure that combines information from both tables in one dataset with all relevant information in one row.

For the next examples I will use the following information in this source to target mapping table:

image


Using FOR XML PATH(‘’) in T-SQL

With the “FOR XML PATH(‘’)” syntax you can transform columns into a string. The T-SQL statement:

SELECT srcColumnList = SELECT  ‘, ‘ + srcColumn 
FROM meta.ColumnMapping WHERE srcColumn is not null
FOR XML PATH (”)

returns: , Id, f_name, l_name, sex a string you can almost use as a column list in a source component in the SELECT <column list> FROM <source table> statement. Almost … because you’ll have to remove the first comma and handle strange characters in column names. So you’ll have to do some stuff to remove this comma:

SELECT srcColumnList = STUFF((SELECT  ‘, ‘ + QUOTENAME(srcColumn) FROM meta.ColumnMapping WHERE srcColumn is not null
FOR XML PATH(”)),1,2,”)

which returns the string we need: [Id], [f_name], [l_name], [sex]


Creating the data conversion task

Expanding on the previous statement you can create additional dynamic content for your Biml solution. In this case the column definition in  a Data Conversion Task. In Biml you would write:

<DataConversion Name=”DC”>
  <Columns>
    <Column SourceColumn=”f_name” TargetColumn=”FirstName”
            DataType=”AnsiString” Length=”40″ CodePage=”1252″ />
     <…more columns …>
</Columns>
</DataConversion>

To create the column list for this data conversion use the statement:

SELECT DCColumns = REPLACE(REPLACE(STUFF(
    (SELECT  char(10) + ‘<Column SourceColumn=”‘ + srcColumn 
        + ‘” TargetColumn=”‘ + tgtColumn
        + ‘” DataType=”‘ + tgtDataType
        + CASE WHEN tgtDataType=’AnsiString’
        THEN ‘” Length=”‘ + CAST(tgtLength AS varchar(10))
             + ‘” CodePage=”1252″ />’
        ELSE ‘” />’ END
     FROM meta.ColumnMapping
     FOR XML PATH(”)),1,1,”),’&lt;’,'<‘),’&gt;’,’>’)

Which returns the Biml string we need:

<Column SourceColumn=”Id” TargetColumn=”CustomerID”
     DataType=”Int32″ />
<Column SourceColumn=”f_name” TargetColumn=”FirstName”
     DataType=”AnsiString” Length=”40″ CodePage=”1252″ />
<Column SourceColumn=”l_name” TargetColumn=”LastName”
     DataType=”AnsiString” Length=”40″ CodePage=”1252″ />
<Column SourceColumn=”sex” TargetColumn=”Gender”
    DataType=”AnsiString” Length=”6″ CodePage=”1252″ />

Some remarks to the SQL statement:

  • We don’t really need the char(10), but the line break is convenient when we look at the results.
  • Since this query uses the for xml syntax and xml can’t handle the opening en closing tag signs. SQL server replaces them with their escape codes. We have to use the REPLACE function to change these escape codes back into the opening and closing tag signs.

Creating the stored procedure

For this example I would create the following stored procedures that combines the two metadata tables:

CREATE PROCEDURE meta.getPackageDetails (@MappingType varchar(50)) AS
SELECT
      PackageName = t.MappingName
    , TargetTable = t.tgtSchema + ‘.’ + t.tgtTable
    , TargetConnection = t.tgtConnection
    , SourceTable = t.srcSchema + ‘.’ + t.srcTable
    , SourceConnection = t.srcConnection
    , srcColumnList = STUFF((SELECT  ‘, ‘ + QUOTENAME(srcColumn)
        FROM meta.ColumnMapping
        WHERE srcColumn is not null
        AND TableMappingID=t.TableMappingID
        FOR XML PATH(”)),1,2,”)
    , DCColumns = REPLACE(REPLACE(STUFF(
        (SELECT  char(10) + ‘<Column SourceColumn=”‘ + srcColumn
        + ‘” TargetColumn=”‘ + tgtColumn
        + ‘” DataType=”‘ + tgtDataType
        + CASE WHEN tgtDataType=’AnsiString’
        THEN ‘” Length=”‘ + CAST(tgtLength AS varchar(10))
            + ‘” CodePage=”1252″ />’
        ELSE ‘” />’ END
        FROM meta.ColumnMapping
        WHERE TableMappingID=t.TableMappingID
        FOR XML PATH(”)),1,1,”),’&lt;’,'<‘),’&gt;’,’>’)
FROM meta.TableMapping t
WHERE t.MappingType = @MappingType

Less complex Biml file

The Biml file that you need to create the packages:

<Biml xmlns=”http://schemas.varigence.com/biml.xsd”>
  <#@ include file=”Connection.biml” #>

    <Packages>
    <# string sConn =  “Provider=SQLNCLI10;Server=.\\SQL2012;
       Initial Catalog=BimlSamples;Integrated Security=SSPI;”; #>
    <# string sSQL = string.Format(“Exec meta.getPackageDetails
      {0}”, “dim”); #>
    <# DataTable tblPackages = ExternalDataAccess.GetDataTable
      (sConn,sSQL); #>
    <# foreach (DataRow pkg in tblPackages.Rows){ #>
       
        <Package Name=”<#=pkg[“PackageName”]#>”
                ConstraintMode=”Linear”>
            <Tasks>

        <Dataflow Name=”DFT <#=pkg[“PackageName”]#>”>
          <Transformations>
            <OleDbSource Name=”ODS Source”
              ConnectionName=”<#=pkg[“SourceConnection”]#>” >
              <DirectInput>
                SELECT <#=pkg[“srcColumnList”]#>
                FROM <#=pkg[“SourceTable”]#>
              </DirectInput>
            </OleDbSource>
           
            <DataConversion Name=”DC”>
              <Columns>
                <#=pkg[“DCColumns”]#>
              </Columns>
            </DataConversion>

            <OleDbDestination Name=”ODD Target”
                ConnectionName=”<#=pkg[“TargetConnection”]#>”>
              <ExternalTableOutput
                 Table=”<#=pkg[“TargetTable”]#>”/>
            </OleDbDestination>

          </Transformations>
        </Dataflow>
      </Tasks>
        </Package>
        <#}#>
    </Packages>
</Biml>
<#@ import namespace=”System.Data” #>

Conclusion

In this blog post I discussed some ways to make you Biml files less complex. You can move (semi)-static content to other files and import them with the include directive or use the CallBimlScript function. And you can move parts of complicated Biml code to T-SQL. The choice of how much you will move to T-SQL will largely depend on your proficiency in C# or T-SQL.

8 Practical BIML Tips

You can leverage Business Intelligence Markup Language – or BIML – to automate the creation of Microsoft SQL Server Integration Services (SSIS) Packages. The popular open source BIDSHelper project includes Biml functionality, enabling anyone to write and execute BIML code for free. Among professional SSIS Developers BIML is quickly gaining popularity.
In this post I’ll share some practical tips ….

1. Better copy and paste experience

The irritating behavior of the XML editor in Visual Studio when copying BIML script and how to overcome it  has been documented on several blogs. In this overview of tips I couldn’t discard it.
In Visual Studio / SSDT … Select Options in the Tools menu.
In the Treeview of the Options form expand Text Editor, expand  XML and choose Formatting.
Next uncheck both options under Auto Reformat.

clip_image002

2. Put directive tags #@..# at the bottom of your BIML file

In many examples (including mine) these directive are placed at the top of the BIML file. Which makes sense because this is the default location in software development. However when Visual Studio opens a file with a directive before the <Biml> xml tag it doesn’t use the xml editor and we lose the formatting and intellisense features in the gui.
So instead place the directives at the bottom of the file. After the closing </Biml> tag. This will not have any effect on the creation of packages.

3. Vanilla Biml file

Before using script in a BIML file create a working BIML file that can create a package with more than 80% of the expected functionality.
Copy this file and use it as a base and then start scripting.
Why? The Combination of xml and code in one document makes it more complicated to select the correct BIML elements and attributes. Next use a small dataset so when you test your work only a small amount of packages are created.

4. Test and save often

During development regularly often check your work. Use the options: “Check Biml for Errors” or “Generate SSIS Packages” from the context menu. This way you not only test your work but save it as well.
Why? Debugging BIML files is mostly a pain. Error messages are limited and often refer to the wrong row and small typos can have a huge impact. So you better find your errors early in development.

5. Special XML characters

Xml has some special characters that you need to enclose in  a CDATA tag or replace the special character with its escape code:

  • double quote ” ( &quot; )
  • single quote ‘  ( &apos; )
  • less than sign  < ( &lt; )
  • greater than sign  >  ( &gt; )
  • ampersand & ( &amp; )

As an example suppose you have the following BIML:

then the xml processor will fail at Age > 25 As a remedy change your BIML into:

<Direct Input>
SELECT Name FROM dbo.People WHERE Age > 25
</Direct Input>

<Direct Input>
<![CDATA[SELECT Name FROM dbo.People WHERE Age > 25]]>
</Direct Input>, or

<Direct Input>
SELECT Name FROM dbo.People WHERE Age > 25
</Direct Input>

6. Special C# characters

C# also as some special characters that you will need to escape with a backslash . Most notably:

  • the backslash itself \ ( \\ )
  • single quote ‘ ( \’ )
  • double quote “ ( \” )

As an example escaping the backslash in a file location  
string FileName = “C:\\Dir\\File.txt”;
or use the verbatim string construction:
string FileName = @”C:\Dir\File.txt”;

7. Learn some basic C#

C# is the principal language of the .NET framework and is widely used for all sorts of programs: Web Services, Web Applications, Windows Form applications, SSIS Scripts, SQL CLR Stored Procedures etc. An investment in learning some C# will pay off. There is an abundant supply of websites and books with relevant information.

To get you started: read the chapters Basics  and  Flow control  off This tutorial: http://zetcode.com/lang/csharp/

8. Learn from the samples

Steal / use the samples on: