Mindmajix

Scripting in SSIS

WHAT’S IN THIS TOPIC ?

  • Selecting your scripting language and getting started
  • Adding assemblies to SSIS Script objects
  • Understanding Script Task usage
  • Understanding Script Component usage
  • Using external SSIS objects from within a script
  • Using events and logging in debugging scripts

Scripting is the Swiss Army knife of SSIS. As shown in Creating an End-to-End Package Topic, many different SSIS features are available out-of-the-box. If you need to do something that you just can’t find anywhere else, you will find additional functionality in three features: the Script Task, the Script Component, and expressions. Expressions, covered in Using Variables, Parameters, and Expressions Topic, are small scripts that set properties. The other two scripting concepts provide access into a scripting development environment using Microsoft Visual Studio Tools for Applications (VSTA) that enables SSIS developers to script logic into packages using Microsoft Visual Basic 2012 or Microsoft Visual C# 2012 .NET code.

In this Topic , you will learn the differences between these script components and when you should use one over the other. You’ll also learn all about the various scripting options available and how to use them in your package development tasks to control execution flow, perform custom transformations, manage variables, and provide runtime feedback.

Introduction

If you think of scripting as something that compiles at runtime and contains unstructured or unmanaged coding languages, then scripting in SSIS does not conform to your idea of scripting. Conversely, if you think of scripting as using small bits of code in specialized places to execute specific tasks, then SSIS scripting won’t be an alien concept. It is helpful to understand why scripting is separated according to functional usage. In this Topic , you will examine the differences and look into the scripting IDE environment, walking through the mechanics of applying programmatic logic into these components — including how to add your own classes and compiled assemblies.

ETL developers have had creative ways of handling logic in their packages. Specifically, digging into what developers were doing, the functional activities can be divided into the following categories:

  • Retrieving or setting the value of package variables
  • Retrieving or setting properties within the package
  • Applying business logic to validate or format data
  • Controlling workflow in a package

Retrieving and setting the value of variables and package properties is so prevalent an activity that the SSIS team creates a completely separate feature that enabled this to be less of a programmatic task. Using the Expression Builder, you can easily alter package components by setting component properties to an expression or a variable that represents an expression.

NOTE Refer to Using Variables, Parameters, and Expressions Topic for detailed information about how to use expressions, parameters, and variables.

To modify properties that connect to, manipulate, or define data, you can use the Data Flow Task to visually represent this activity. However, to achieve functionality not provided out of the box, you still need scripting, so the Script Component was added. The primary role of the Script Component is to extend the Data Flow capabilities and allow programmatic data manipulation within the context of the Data Flow. However, it can do more, as you’ll learn later in this Topic.

To continue to enable the numerous miscellaneous tasks that are needed in ETL development, use the Script Task, which can be used only in the Control Flow design surface. In this task, you can perform various manipulations within the managed code framework of .NET.

The Script Task and Script Component use the Visual Studio Tools for Applications (VSTA) environment. VSTA is essentially a scaled-down version of Visual Studio that can be added to an application that allows coding extensions using managed code and .NET languages. Even though SSIS packages are built inside of Visual Studio, when you are in the context of a Script Task or Script Component, you are actually coding in the VSTA environment that is, in fact, a mini-project within the package. The VSTA IDE provides IntelliSense, full edit-and-continue capabilities, and the ability to code in either Visual Basic or C#. You can even access some of the .NET assemblies and use web references for advanced scripting.

NOTE To gain the most from this scripting Topic , you need a basic understanding of programming in either C# or Visual Basic. If you don’t already have it, you can obtain this knowledge from either Beginning Visual C# 2012 Programming by Karli Watson and colleagues (Wrox; ISBN: 978-1-118-31441-8) or Beginning Visual Basic 2012 by Bryan Newsome (Wrox; ISBN: 978-1-118-31181-3).

GETTING STARTED IN SSIS SCRIPTING

The Script Task and Script Component have greatly increased your possibilities when it comes to script-based ETL development in SSIS. However, it is important to know when to use which component and what things can be done in each.

The following matrix explains when to use each component:

Screenshot_247

To get a good look at the scripting model, the next example walks through a simple “Hello World” coding project in SSIS. Although this is not a typical example of ETL programming, it serves as a good introduction to the scripting paradigm in SSIS, followed by the specific applications of the Script Task and Script Component.

Selecting the Scripting Language

SSIS allows the developer to choose between two different scripting languages: C# or Visual Basic (VB). To see where you can make this choice, drop a Script Task onto the Control Flow design surface. Right-click the Script Task and click Edit from the context menu. The first thing you’ll notice is the availability of two scripting languages: Microsoft Visual C# 2012 and Microsoft Visual Basic 2012 in the ScriptLanguage property of the task. below screen shot shows these options in the Script Task Editor.

Script task editor

After clicking the Edit Script button, you’ll be locked into the script language that you chose and you won’t be able to change it without deleting and recreating the Script Task or Script Component. This is because each Script item contains its own internal Visual Studio project in VB or C#. You can create separate Script items whereby each one uses a different language within a package. However, using Script items in both languages within the same package is not recommended, as it makes maintenance of the package more complex. Anyone maintaining the package would have to be competent in both languages.

Using the VSTA Scripting IDE

Clicking the Edit Script button on the editor allows you to add programmatic code to a Script Task or Script Component. Although the Script Task and Script Component editors look different, they both provide an Edit Script button to access the development IDE for scripting, as shown in below screen shot

development IDE for scripting

Once you are in the IDE, notice that it looks and feels just like Visual Studio. below screen shot shows an example of how this IDE looks after opening the Script Task for the VB scripting language.

this IDE looks after opening the Script Task for the VB scripting language

The code window on the left side of the IDE contains the code for the item selected in the Solution Explorer on the top-right window. The Solution Explorer shows the structure for the project that is being used within the Scripting Task. A complete .NET project is created for each Script Task or Component and is temporarily written to a project file on the local drive where it can be altered in the Visual Studio IDE. This persistence of the project is the reason why once you pick a scripting language, and generate code in the project, you are locked into that language for that Scripting item. Notice in above the screenshot that a project has been created with the namespace of ST_a8363e166ca246a3bedda7. However, you can’t open this project directly, nor need you worry about the project during deployment. These project files are extracted from stored package metadata. With the project created and opened, it is ready for coding.

Example: Hello World

In the IDE, the Script Task contains only a class named ScriptMain. In the entry-point function, Main(), you’ll put the code that you want executed. Part of that code can make calls to additional functions or classes. However, if you want to change the name of the entry-point function for some reason, type the new name in the property called EntryPoint on the Script page of the editor. (Alternatively, you could change the name of the entry point at runtime using an expression.)

In the VSTA co-generated class ScriptMain, you’ll also see a set of assembly references already added to your project, and namespaces set up in the class. Depending upon whether you chose VB or C# as your scripting language, you’ll see either:

C#

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

or

VB

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

These assemblies are needed to provide base functionality as a jump-start to your coding. The remainder of the class includes VSTA co-generated methods for startup and shutdown operations, and finally the entry-point Main() function, shown here in both languages:

C#

public void Main()
{/
/ TODO: Add your code here
Dts.TaskResult = (int)ScriptResults.Success;
}

VB

Public Sub Main()

Add your code here

Dts.TaskResult = ScriptResults.Success
End Sub

Note that the Script Task must return a result to notify the runtime of whether the script completed successfully or not. The result is passed using the Dts.TaskResult property. By your setting the result to ScriptResults.Success, the script informs the package that the task completed successfully.

NOTE The Script Component does not have to do this, since it runs in the context of a Data Flow with many rows. Other differences pertaining to each component are discussed separately later in the topic.

To get a message box to pop up with the phrase “Hello World!” you need access to a class called MessageBox in a namespace called System.Windows.Forms. This namespace can be called directly by its complete name, or it can be added after the Microsoft.SqlServer.Dts.Runtime namespace to shorten the coding required in the class. Both of these methods are shown in the following code (ProSSIS\Code\Ch09_ProSSIS\02HelloWorld.dtsx) to insert the MessageBox code into the Main() function:

C#

using System.Windows.Forms;

MessageBox.Show(“Hello World!”);
Or
System.Windows.Forms.MessageBox.Show(“Hello World!”);

VB

Imports System.Windows.Forms

MessageBox.Show(“Hello World!”)
Or
System.Windows.Forms.MessageBox.Show(“Hello World!”)

Get in the habit now of building the project after adding this code. The Build option is directly on the menu when you are coding. Previous versions of SSIS gave you the opportunity to run in precompile or compiled modes. SSIS now will automatically compile your code prior to executing the package at runtime. Compiling gives you an opportunity to find any errors before the package finds them. Once the build is successful, close the IDE and the editor, and right-click and execute the Script Task. A pop-up message box should appear with the words “Hello World!” (see below the screen shot).

Hello World popup

Adding Code and Classes

Using modal message boxes is obviously not the type of typical coding desired in production SSIS package development. Message boxes are synchronous and block until a click event is received, so they can stop a production job dead in its tracks. However, this is a basic debugging technique to demonstrate the capabilities in the scripting environments before getting into some of the details of passing values in and out using variables. You also don’t want to always put the main blocks of code in the Main() function. With just a little more work, you can get some code reuse from previously written code using some cut-and-paste development techniques. At the very least, code can be structured in a less procedural way. As an example, consider the common task of generating a unique filename for a file you want to archive.

Typically, the filename might be generated by appending a prefix and an extension to a variable like a guid. These functions can be added within the ScriptMain class bodies to look like this (ProSSIS\Code\Ch09_ProSSIS\03BasicScript.dtsx):

C# 

Public partial class ScriptMain
{

public void Main()
{S
ystem.Windows.Forms.MessageBox.Show(GetFileName(“bankfile”, “txt”));
Dts.TaskResult = (int)ScriptResults.Success;
}p
ublic string GetFileName(string Prefix, string Extension)
{r
eturn Prefix + “-” + Guid.NewGuid().ToString() + “.” + Extension;
}}

VB

Partial Class ScriptMain

Public Sub Main()
System.Windows.Forms.MessageBox.Show(GetFileName(“bankfile”, “txt”))
Dts.TaskResult = ScriptResults.Success
End Sub
Public Function GetFileName(ByVal Prefix As String, _
ByVal Extension As String) As String
GetFileName = Prefix + “-” + Guid.NewGuid.ToString + _
“.” + Extension
End Function
End Class

Instead of all the code residing in the Main() function, structured programming techniques can separate and organize SSIS scripting. In this example, the GetFileName function builds the filename and then returns the value, which is shown in a message box, as shown in below screen shot.

message box,

Of course, copying and pasting the same code into multiple Script Tasks is pretty inefficient and produces solutions that are difficult to maintain. If you have preexisting compiled code, shouldn’t you be able to reuse this code without finding the original source for the copy-and-paste operation? What a great question! You can, with some caveats.

Using Managed Assemblies

Reusing code, no matter what language it was written in, increases the maintainability and supportability of an application. While you can only write SSIS scripts using Visual Basic and C#, SSIS provides the capability to reuse code by reusing assemblies that are part of the .NET Framework or any assembly created using a .NET-compliant language, including C#, J#, and even Delphi, but there are some important qualifications:

  • For a managed assembly to be used in an Integration Service, you must install the assembly in the global assembly cache (GAC).
  • All dependent or referenced assemblies must also be registered in the GAC. This implies that the assemblies must be strongly named.
  • For development purposes only, VSTA can use managed assemblies anywhere on the local machine.

If you think about this it makes sense, but within SSIS, it might seem confusing at first. On the one hand, a subproject is created for the Script Task, but it is deployed as part of the metadata of the package; there is no separate physical DLL file for the assembly. In this case, you don’t have to worry about deployment of individual script projects. However, when you use an external assembly, it is not part of the package metadata, and here you do have to worry about deployment of the assembly. Where then do you deploy the assembly you want to use? Because SSIS packages are typically deployed within SQL Server, the most universal place to find the assembly would be in the GAC.

If you are using any of the standard .NET assemblies, they are already loaded and stored in the GAC and the .NET Framework folders. As long as you are using the same framework for your development and production locations, using standard .NET assemblies requires no additional work in your environment. To use a standard .NET assembly in your script, you must reference it. To add a reference in a scripting project, open the VSTA environment for editing your script code — not the SSIS package itself. Right- click the project name in the Solution Explorer or go to the Project menu and select the Add Reference option. The new Reference Manager dialog will appear, as in below screen shot.

reference manager

Select the assemblies from the list that you wish to reference and click the OK button to add the references to your project. Now you can use any objects located in the referenced assemblies either by directly referencing the full assembly or by adding the namespaces to your ScriptMain classes for shorter references, similar to the Windows Forms assembly used in the Hello World example. References can be removed from the project References screen. Find this screen by double-clicking the My Project node of the Solution Explorer. Select the References menu to see all references included in your project. To remove a reference, select the name and click the Delete key.

Example: Using Custom .NET Assemblies

Although using standard .NET assemblies is interesting, being able to use your own compiled .NET assemblies really extends the capabilities of your SSIS development. Using code already developed and compiled means not having to copy-and-paste code into each Script Task, enabling you to reuse code already developed and tested. To examine how this works, in this section you’ll create an external custom .NET library that can validate a postal code and learn how to integrate this simple validator into a Script Task. (To do this, you need the standard class library project templates that are part of Visual Studio. If you installed only SQL Server Data Tools, these templates are not installed by default.) You can also download the precompiled versions of these classes.

To start, open a standard class library project in the language of your choice, and create a standard utility class in the project that looks something like  his (proSSIS\Code\Ch09_ProSSIS\SSISUtilityLib_VB\SSISUtilityLib_VB\DataUti)

C#

using System.Text.RegularExpressions;
namespace SSISUtilityLib_CSharp
{p
ublic static class DataUtilities
{p
ublic static bool isValidUSPostalCode(string PostalCode)
{r
eturn Regex.IsMatch(PostalCode, “^[0-9]{5}(-[0-9]{4})?$”);
}
}
}

VB

Imports System.Text.RegularExpressions
Public Class DataUtilities
Public Shared Function isValidUSPostalCode
(ByVal PostalCode As String) As Boolean
isValidUSPostalCode = Regex.IsMatch(PostalCode, “^[0-9]{5}(-[0-9]
{4})?$”)
End Function
End Class

Because you are creating projects for both languages, the projects (and assemblies) are named SSISUtilityLib_VB and SSISUtilityLib_Csharp. Notice the use of static or shared methods. This isn’t required, but it’s useful because you are simulating the development of what could later be a utility library loaded with many stateless data validation functions. A static or shared method allows the utility functions to be called without instantiating the class for each evaluation.

Now sign the assembly by right-clicking the project to access the Properties menu option. In the Signing tab, note the option to “Sign the assembly,” as shown in below screen shot. Click New on the dropdown and name the assembly to have a strong name key added to it.

ssis scripting

In this example, the VB version of the SSISUtilityLib project is being signed. Now you can compile the assembly by clicking the Build option in the Visual Studio menu. The in-process DLL will be built with a strong name, enabling it to be registered in the GAC.

On the target development machine, open a command-line prompt window to register your assembly with a command similar to this:

C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0
Tools>
Gacutil /i
C:\ProSSIS\Code\SSISUtilityLib_CSharp\SSISUtilityLib_CSharp\
bin\Release\SSISUtilityLib_CSharp.dll

NOTE Note that you may have to run the command line as administrator or have the User Access Control feature turned off to register the assembly.

If you are running on a development machine, you also need to copy the assembly into the appropriate .NET Framework directory so that you can use the assembly in the Visual Studio IDE. Using the Microsoft .NET Framework Configuration tool, select Manage the Assembly Cache. Then select Add an Assembly to the Assembly Cache to copy an assembly file into the global cache.

NOTE For a detailed step-by-step guide to the deployment, see the SSIS Developer’s Guide on Custom Objects located at http://msdn.microsoft.com/en-us/library/ms403356(v=SQL.110).aspx.

To use the compiled assembly in an SSIS package, open a new SSIS package and add a new Script Task to the Control Flow surface. Select the scripting language you wish and click Edit Script. You’ll need to right-click the Solution Explorer node for references and find the reference for SSISUtilityLib_VB.dll or SSISUtilityLib_CSharp.dll depending on which one you built. If you have registered the assembly in the GAC, you can find it in the .NET tab. If you are in a development environment, you can simply browse to the .dll to select it.

Add the namespace into the ScriptMain class. Then add these namespaces to the ScriptMain class:

C#

using SSISUtilityLib_CSharp;

VB

Imports SSISUtilityLib_VB

Note that the SSIS C# Script Task in the sample packages you’ll see if you download the Topic materials from www.wrox.com/go/prossis2014 use both the C# and the VB versions of the utility library. However, this is not required. The compiled .NET class libraries may be intermixed within the SSIS Script Task or Components regardless of the scripting language you choose.

Now you just need to code a call to the utility function in the Main() function like this (ProSSIS\Code\Ch09_ProSSIS\04SSISPackageUsingAssembly.dtsx):

C#

public void Main()
{s
tring postalCode = “12345-1111”;
string msg = string.Format(
“Validating PostalCode {0}\nResult..{1}”, postalCode,
DataUtilities.isValidUSPostalCode(postalCode));
MessageBox.Show(msg);
Dts.TaskResult = (int)ScriptResults.Success;
}

VB

Public Sub Main()
Dim postalCode As String = “12345-1111”
Dim msg As String = String.Format(“Validating PostalCode {0}” + _
vbCrLf + “Result..{1}”, postalCode, _
DataUtilities.isValidUSPostalCode(postalCode))
MessageBox.Show(msg)
Dts.TaskResult = ScriptResults.Success
End Sub

Compile the Script Task and execute it. The result should be a message box displaying a string to validate the postal code 12345-1111. The postal code format is validated by the DataUtility function IsValidUSPostalCode. There was no need to copy the function in the script project. The logic of validating  the format of a U.S. postal code is stored in the shared DataUtility functionand can easily be used in both Script Tasks and Components with minimal coding and maximum consistency. The only downside to this is that there is now an external dependency in the SSIS package upon this assembly. If the assembly changes version numbers, you’ll need to open and recompile all the script projects for each SSIS package using this. Otherwise, you could get an error if you aren’t following backward compatibility guidelines to ensure that existing interfaces are not broken. If you have a set of well-tested business functions that rarely change, using external assemblies may be a good idea for your SSIS development.


 

0 Responses on Scripting in SSIS"

Leave a Message

Your email address will not be published. Required fields are marked *

Copy Rights Reserved © Mindmajix.com All rights reserved. Disclaimer.
Course Adviser

Fill your details, course adviser will reach you.