So far, we’ve been focused on tasks that are occurring within the immediate realm of ETL processing. You’ve looked at tasks for creating control structures, preparing data, and performing RDBMS operations. This section looks at being able to control other processes and applications in the operating system. Here we sidestep a bit from typical ETL definitions into things that can be more enterprise application integration (EAI) oriented. SSIS packages can also be organized to execute other packages or to call external programs that zip up files or send e-mail alerts, and even put messages directly into application queues for processing.
The Execute Package Task enables you to build SSIS solutions called parent packages that execute other packages called child packages. You’ll find that this capability is an indispensable part of your SSIS development as your packages begin to grow. Separating packages into discrete functional workflows enables shorter development and testing cycles and facilitates best development practices. Though the Execute Package Task has been around since the legacy DTS, several improvements have simplified the task:
The majority of configurable properties are in the Package tab of the Execute Package Task Editor. The first option provides the location of the child package. The ReferenceType option can be either External or Project References. This means you can point to a package inside your current project or outside the project to a SQL Server or file system. The best (easiest) option is to refer to a package in a project, as this option will easily “repoint” the reference as you migrate to production. If you point to an External Reference, you’ll need to create a Connection Manager that won’t automatically repoint as you migrate your packages from development to production. The configured tab will look like Figure 3-29.
Next, go to the Parameter Bindings tab to pass parameters into the child package. First, select any parameters in the child package from its dropdown box, and then map them to a parameter or variable in the parent package. Parameters will only work here with Project Referenced packages. You can see an example of this in Figure 3-30, or download the entire example from WWW.WROX.COM/GO/PROSSIS2014.
Execute Process Task
The Execute Process Task will execute a Windows or console application inside of the Control Flow. You’ll find great uses for this task to run command-line-based programs and utilities prior to performing other ETL tasks. The most common example would have to be unzipping packed or encrypted data files with a command-line tool.
You can store any errors resulting from the execution of the task into a variable that can be read later and logged. In addition, any output from the command file can also be written to a variable for logging purposes. Figure 3- 31 shows a sample of using the Execute Process Task to expand a compressed customers.zip file.
The Process tab in the Execute Process Task Editor contains most of the important configuration items for this task:
These variable values can be used to send back to a scripting component to log or can be used in a precedence constraint that checks the length of the variables to determine whether you should go to the next task. This provides the logical functionality of looping back and trying again if the result of the execution of the expand.exe program was a sharing violation or another similar error.
Other options in the Process tab include:
With the Execute Process Task, you can continue to use command-line or out-of-processes executables to organize work for ETL tasks. Now it’s time to take a look at how SSIS can interact and integrate with your enterprise messaging bus.
The Message Queue Task enables you to send or receive messages from Microsoft Message Queuing (MSMQ) right out of the box. For integration with other messaging systems like IBM’s MQ Series or Tibco’s Rendezveus, you need to either code to a library within a Script Task, create a custom component, or execute T-SQL statements to a SQL Server Service Broker queue. Messaging architectures are created to ensure reliable communication between two disparate subsystems.
Frequently Asked SSIS Interview Questions & Answers
A message can be a string, a file, or a variable. The main benefit to using this task is the capability to make packages communicate with each other at runtime. You can use this to scale out your packages, having multiple packages executing in parallel, with each loading a subset of the data, and then checking in with the parent package after they reach certain checkpoints. You can also use this task for enterprise-level information integration to do things like deliver dashboard-level information using XML files to an enterprise bus or distribute report content files across your network. Satellite offices or any other subscriber to those services could pull content from the queue for application-level processing.
The task is straightforward. In the General tab, shown in Figure 3-32, you specify the MSMQ Connection Manager under the MSMQConnection property. Then, you specify whether you want to send or receive a message under the Message option. In this tab, you can also specify whether you want to use the legacy Windows 2000 version of MSMQ; this option is set to false by default.
The bulk of the configuration is under the Send or Receive tab (the one you see varies according to the Message option you selected in the General tab). If you’re on the Receive tab, you can configure the task to remove the message from the queue after it has been read. You can also set the timeout properties here, to control whether the task will produce an error if it experiences a timeout.
Regardless of whether you’re sending or receiving messages, you can select the type of message under the MessageType option. You can either send or receive a string message, a variable, or a data file. Additionally, if you’re receiving a message, you can immediately store the message you receive in a package variable by setting String Message to Variable and then specifying a variable in the Variable option.
The Send Mail Task provides a configurable SSIS task for sending e-mail messages via SMTP. In legacy DTS packages, you had to send messages out through MAPI, which meant installing Outlook on the server on which the package was running. That is now no longer a requirement. Most of the configuration options are set in the Mail tab (shown in Figure 3-33) of the Send Mail Task Editor. The SmtpConnection property is where you either create a new or select an existing SMTP Connection Manager.
Most of the configuration options will depend upon your specific SMTP connection. One option of special interest is the MessageSourceType property, which specifies whether the message source will be provided from a file or a variable or be directly inputted into the MessageSource property. Typically, the best practice is to use a variable-based approach to set the MessageSource property.
Windows Management Instrumentation (WMI) is one of the best-kept secrets in Windows. WMI enables you to manage Windows servers and workstations through a scripting interface similar to running a T-SQL query. The WMI Data Reader Task enables you to interface with this environment by writing WQL queries (the query language for WMI) against the server or workstation (to look at the Application event log, for example). The output of this query can be written to a file or variable for later consumption. Following are some applications for which you could use the WMI Data Reader Task:
To get started, you first need to set up a WMI Connection Manager in the Connection Manager Editor. Connection requirements vary, but Figure 3-34 shows an example of a WMI connection for a typical standalone workstation.
Notice here that the Use Windows Authentication option has been set. WMI typically requires a higher level of security authorization because you are able to query OS-level data. With a WMI connection, you can configure the WMI Data Reader Task Editor using the WMI Options tab shown in Figure 3-35.
WQL queries look like SQL queries, and for all practical purposes they are, with the difference that you are retrieving data sets from the operating systems. For example, the following query selects the free space, the name, and a few other metrics about the C: drive (see code file Ch03SQL.txt):
SELECT FreeSpace, DeviceId, Size, SystemName, Description FROM
Win32_LogicalDisk
WHERE DeviceID = ‘C:’
The output of this type of query would look like this in a table:
Description, Local Fixed Disk
DeviceID, C:
FreeSpace, 32110985216
Size, 60003381248
SystemName, BKNIGHT
The following example of a WQL query selects information written to the Application event log after a certain date about the SQL Server and SSIS services (code file Ch03SQL.txt):
SELECT * FROM Win32_NTLogEvent WHERE LogFile = ‘Application’ AND
(SourceName=’SQLISService’ OR SourceName=’SQLISPackage’) AND
TimeGenerated>
‘20050117’
The results would look like this:
0B
KNIGHT
12289
1073819649
3S
ystem.String[]
Application
3738
SQLISPackage
20050430174924.000000-240
20050430174924.000000-240
Information
BKNIGHTBrian Knight
0
Typically, the WMI Data Reader Task is used in SQL Server administration packages to gather operational-type data from the SQL Server environments.
However, the WMI Event Watcher Task has some interesting uses for ETL
processes that you’ll look at next.
The WMI Event Watcher Task empowers SSIS to wait for and respond to certain WMI events that occur in the operating system. The task operates in much the same way as the WMI Data Reader Task operates. The following are some of the useful things you can do with this task:
To illustrate the last example of polling to determine when the CPU is less than 50 percent utilized, you could have the WMI Event Watcher Task look for an event with this WQL code:
SELECT * from __InstanceModificationEvent WITHIN 2 WHERE
TargetInstance ISA
‘Win32_Processor’ and TargetInstance.LoadPercentage < 50
The next section looks at a direct application of this WMI Event Watcher Task to give you a better idea of how to configure it and what it can do.
One very practical use of the WMI Event Watcher for ETL processing is to provide a buffer between the time when an SSIS job starts and the time when a file is actually delivered to a folder location. If there is a window of variability in file delivery and an SSIS package starts on a onetime schedule, then it is possible to miss processing the file for the day. By using a WMI Event Watcher, you can set up your SSIS packages to poll a folder location for a set period of time until a file is detected. If you have this type of challenge, a better solution may be a ForEach Loop Container scheduled to run periodically, but you’ll learn more about that in Chapter 6. To set up a task to perform this automated action, open the WMI Options tab of the WMI Event Watcher Task Editor (see Figure 3-36). Notice that this WMI Task is completely different from the WMI Data Reader Task.
This WMI Event Watcher Task provides properties such as the AfterEvent option, which specifies whether the task should succeed, fail, or keep querying if the condition is met. You also need to provide a length of time after which the WMI Event Watcher stops watching by setting the Timeout property. The timeout value is in seconds. The default of zero (0) indicates that there is no timeout. Outside of your development activities, be very careful with leaving this setting on zero (0). The WMI Event Watcher could leave your SSIS package running indefinitely.
You can also configure what happens when a timeout occurs under the ActionAtTimeout and AfterTimeout settings. The NumberOfEvents option configures the number of events to watch for. You can use this setting to look for more than one file before you start processing.
The WqlQuerySource for the File Watcher Configuration for this WMI Task would look like this code:
SELECT * FROM __InstanceCreationEvent WITHIN 10
WHERE TargetInstance ISA “CIM_DirectoryContainsFile”
AND TargetInstance.GroupComponent =
“Win32_Directory.Name=”c:ProSSIS””
If you run this task with no files in the C:ProSSIS directory, the task will remain yellow as the watcher continuously waits for an event to be raised. If you copy a file into the directory, the task will turn green and complete successfully. This is a great addition that is less resource-intensive than the legacy DTS version of iterating in a For loop until the file is found. As you can see, there are some major improvements in the capabilities to control workflow in SSIS.
SSRS | Power BI |
SSAS | SQL Server |
SCCM | SQL Server DBA |
SharePoint | BizTalk Server |
Team Foundation Server | BizTalk Server Administrator |
Name | Dates | |
---|---|---|
SSIS Training | Sep 21 to Oct 06 | View Details |
SSIS Training | Sep 24 to Oct 09 | View Details |
SSIS Training | Sep 28 to Oct 13 | View Details |
SSIS Training | Oct 01 to Oct 16 | View Details |
Yamuna Karumuri is a content writer at Mindmajix.com. Her passion lies in writing articles on IT platforms including Machine learning, PowerShell, DevOps, Data Science, Artificial Intelligence, Selenium, MSBI, and so on. You can connect with her via LinkedIn.