Everyone has been there — one of your Data Transformation Services (DTS) packages failed overnight, and you now have to completely rerun the package. This is particularly painful if some of the processes inside the package are expensive in terms of resources or time. In DTS, it wasn’t possible to restart a package from where it left off, and picking apart a package to run just those tasks that failed was tedious and error prone. A variety of exotic solutions have been used, such as a post-execution process that goes into the package and recreates the package from the failed step forward. Although this worked, it required someone with a detailed knowledge of the DTS object model, which most production DBAs did not have. If your process takes data from a production SQL Server that has a very small window of ETL opportunity, you can be sure that most DBAs are not going to be pleased when you tell them you need to run the extract again, and that it may impact the users.
For this reason, “package restartability” or checkpoints in SQL Server Integration Services was a huge relief. In this Reliability and Scalability Topic, you are going to learn everything you need to know to enable restartability in your SSIS packages.
Checkpoints are the foundation for restarting packages in SSIS, and they work by writing state information to a file after each task completes. This file can then be used to determine which tasks have run and which have failed. More detail about these files is provided in the “Inside the Checkpoint File” section. To ensure that the checkpoint file is created correctly, you must set three package properties and one task property, which can be found on the property pages of the package and task. The package properties are as follows:
- CheckpointFilename: This is the filename of the checkpoint file, which must be provided. There are no specific conventions or requirements for the filename.
- CheckpointUsage: This property takes one of three values, which describe how a checkpoint file is used during package execution:
  - Never: The package will not use a checkpoint file and therefore will never restart.
  - IfExists: If a checkpoint file exists at the location specified in the CheckpointFilename property, it will be used, and the package will restart according to the checkpoints written. If no file exists, the package starts from the beginning.
  - Always: The package will always use a checkpoint file to restart; if one does not exist, the package will fail.
- SaveCheckpoints: This is a simple Boolean to indicate whether checkpoints are to be written. Obviously, this must be set to true for this scenario.
The one property you have to set on the task is FailPackageOnFailure. This must be set for each task or container that you want to be the point for a checkpoint and restart. If you do not set this property to true and the task fails, no file will be written, and the next time you invoke the package, it will start from the beginning again. You’ll see an example of this happening later.
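The way CheckpointUsage interacts with the presence of a checkpoint file can be sketched as a small decision function. This is a Python model written purely for illustration, not SSIS code: the function name `resolve_start`, the `file_exists` flag, and the returned strings are all assumptions introduced here.

```python
def resolve_start(checkpoint_usage: str, file_exists: bool) -> str:
    """Model where a package run begins for each CheckpointUsage value."""
    if checkpoint_usage == "Never":
        # The checkpoint file is ignored even if one is present.
        return "start from beginning"
    if checkpoint_usage == "IfExists":
        # Restart only when a file is found; otherwise run normally.
        return "restart from checkpoint" if file_exists else "start from beginning"
    if checkpoint_usage == "Always":
        if not file_exists:
            # With Always, a missing file makes the package fail.
            raise RuntimeError("package fails: checkpoint file is required")
        return "restart from checkpoint"
    raise ValueError(f"unknown CheckpointUsage value: {checkpoint_usage}")


for usage in ("Never", "IfExists"):
    for exists in (False, True):
        print(usage, exists, "->", resolve_start(usage, exists))
print("Always", True, "->", resolve_start("Always", True))
```

In this model, IfExists is the only value that both starts cleanly with no file and restarts when one is present, which is why it suits a package that must run on a schedule and also recover from a failure.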
NOTE As you know, SSIS packages are divided into the Control Flow and the Data Flow. Checkpoints occur only at the Control Flow level; it is not possible to checkpoint transformations or restart inside a Data Flow Task. A Data Flow Task can be a checkpoint, but it is treated like any other task. This ensures that a Data Flow Task writes all or none of its data, like a SQL transaction with commits and rollbacks. Implementing your own checkpoint and restart feature for data is described later in the Topic.
Keep in mind that if nothing fails in your package, no file will be generated. You’ll have a look later at the generated file itself and try to make some sense out of it, but for now you only need to know that the file contains all the information needed by the package when it is restarted, enabling it to behave like nothing untoward had interrupted it. That’s enough information to be able to start using checkpoints in your packages, so now you can proceed with some examples.
Simple Control Flow
The first example package you will create contains a simple Control Flow with a series of tasks meant to highlight the power of checkpoints. Create a package named Checkpoint.dtsx that contains three Execute SQL Tasks, as shown in the following screenshot.
In the new package, add a connection manager to the AdventureWorksDW database. Then add a simple select statement, such as “select 1,” to the first and third tasks. The second task, aptly named “2,” is set to fail with a divide-by-zero error, as shown in the Task Editor in the following screenshot.
Assume that the task labeled “1” is expensive, and you want to ensure that you don’t need to execute it twice if it finishes and something else in the package fails. You now need to set up both the package and the task to use checkpoints. First, set the package properties described earlier, as shown in the following screenshot.
Next, set up the task labeled “2” to use checkpoints by changing its FailPackageOnFailure property to True (see the following screenshot).
Now you can execute the package. The expected outcome is shown in the following screenshot: the first task completes successfully with a check mark in a green circle, but the second task fails with an X in a red circle. The third task does not execute. Because the screenshots for this Tutorial are in black and white, you won’t be able to see the colors here, but you can see that the tasks contain different symbols in the top-right corner.
If you had created this package in DTS, you would have had to write some logic to cope with the failure in order to not have to execute task 1 again. Because you are working in SSIS and have set the package up properly, you can rely on checkpoints. When the package failed, the error output window said something like this:
SSIS package “C:\ProSSIS\Code\Ch15\01_Checkpoints.dtsx” starting.
Information: 0x40016045 at 01_Checkpoints: The package will be saving
checkpoints to file “C:\ProSSIS\Files\Checkpoint.xml” during
execution. The package is configured to save checkpoints.
Information: 0x40016049 at 01_Checkpoints: Checkpoint file
“C:\ProSSIS\Files\Checkpoint.xml” update starting.
Information: 0x40016047 at 1: Checkpoint file
“C:\ProSSIS\Files\Checkpoint.xml” was updated to record completion of
Error: 0xC002F210 at 2, Execute SQL Task: Executing the query “Select
1/0” failed with the following error: “Divide by zero error
encountered.”. Possible failure reasons: Problems with the query,
“ResultSet” property not set correctly, parameters not set correctly,
or connection not established correctly.
Task failed: 2
SSIS package “C:\ProSSIS\Code\Ch15\01_Checkpoints.dtsx” finished:
As you can see, the output window says that a checkpoint file was written. If you look at the file system, you can see that this is true, as shown in the following screenshot. You’ll look inside the file later when you have a few more things of interest in there, but for the moment just understand that the package now knows what happened and where.
Now you need to fix the problem by removing the divide-by-zero issue in the second task (change the SQL Statement from Select 1/0 to Select 1) and run the package again. The following screenshot shows what happens after you do that.
Task 2 was executed again, followed by task 3. Task 1 did not run again, because the checkpoint file recorded that it had already completed.
Recall from earlier that the task you want to be the site for a checkpoint must have the FailPackageOnFailure property set to true. Otherwise, no file will be written, and when the package executes again it will start from the beginning. Here is how that works. Set task 2 to not use checkpoints by setting this property to false, as shown in the following screenshot.
Change the SQL Statement in task 2 back to Select 1/0 and then execute the package again. No checkpoint file is written, as expected. This means that after you fix the error in the task again and rerun the package one more time, the results look like Figure 15-9: all tasks have a green check mark in the top-right corner, which may or may not be what you want.
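The two outcomes just described can be imitated with a small simulation. The sketch below is a Python model of the behavior, not how SSIS is implemented: `run_package`, the in-memory set standing in for the checkpoint file, and the helper functions are assumptions made for illustration.

```python
from typing import Callable, List, Optional, Set, Tuple

Task = Tuple[str, Callable[[], None]]


def run_package(tasks: List[Task],
                checkpoint: Optional[Set[str]],
                fail_package_on_failure: Set[str]
                ) -> Tuple[List[str], Optional[Set[str]]]:
    """Run tasks in order, skipping any recorded in the checkpoint.

    Returns the names of the tasks executed in this run and the checkpoint
    left behind. The checkpoint is None when the run succeeds, or when the
    failing task does not have FailPackageOnFailure set.
    """
    completed = set(checkpoint or ())
    executed: List[str] = []
    for name, action in tasks:
        if name in completed:
            continue  # finished on an earlier run, so skip it
        executed.append(name)
        try:
            action()
        except Exception:
            if name in fail_package_on_failure:
                return executed, completed  # keep state for the next run
            return executed, None  # no checkpoint: next run starts over
        completed.add(name)
    return executed, None  # success: nothing left to restart from


def ok() -> None:
    pass


def divide_by_zero() -> None:
    1 / 0  # stands in for "Select 1/0"


broken = [("1", ok), ("2", divide_by_zero), ("3", ok)]
fixed = [("1", ok), ("2", ok), ("3", ok)]

# Task 2 has FailPackageOnFailure set: the rerun resumes at task 2.
first_run, saved = run_package(broken, None, {"2"})
second_run, leftover = run_package(fixed, saved, {"2"})
print(first_run, second_run)  # ['1', '2'] ['2', '3']

# Task 2 does not have it set: nothing is saved, so everything reruns.
_, no_checkpoint = run_package(broken, None, set())
full_rerun, _ = run_package(fixed, no_checkpoint, set())
print(no_checkpoint, full_rerun)  # None ['1', '2', '3']
```

The only difference between the two halves is the set passed as `fail_package_on_failure`, which mirrors how a single task property decides whether the expensive first task has to run twice.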
This example is a very simple one that involves only three tasks joined by a workflow, but hopefully it has given you an idea about restartability in SSIS packages; the examples that follow are more complicated.
Containers within Containers and Checkpoints
Containers and transactions have an effect on checkpoints. You will see these effects in this example, and change some properties and settings while you’re at it. First, create a new package named ContainerCheckpoints.dtsx using Sequence Containers and checkpoints. This package has two Sequence Containers, which themselves contain Execute SQL Tasks, as shown in the following screenshot.
Make sure the package has all the settings necessary to use checkpoints, as in the previous example. On the initial run-through of this package, the only place you want a checkpoint is task 3, so set the FailPackageOnFailure property of task 3 to true. The following screenshot shows what happens when you deliberately set this task to fail, perhaps with a divide-by-zero error (refer to the previous example to see how to do that).
As expected, task 3 has failed, and the Sequence Container, SEQ 2, has also failed because of this. If you now fix the problem with task 3 and re-execute the package, you will see results matching those shown in the following screenshot.
Therefore, there’s no real difference here from the earlier example except that the Sequence Container “SEQ 2” has a green check. Now you’ll change the setup of the package to see the behavior change dramatically. What you’re going to do is make the Sequence Container SEQ 2 transacted. That means you’re going to wrap SEQ 2 and its child containers in a transaction. Change the properties of the SEQ 2 container to look like the following screenshot, so that the TransactionOption property is set to Required.
Setting the TransactionOption property of the SEQ 2 container to Required means that it will start its own transaction. Now open the properties window of the two child Execute SQL Tasks and set their TransactionOption properties to Supported, as shown in the following screenshot, so that they will join a transaction if one exists.
WARNING SSIS uses the Microsoft Distributed Transaction Coordinator (MSDTC) to manage its transactions. Before setting the TransactionOption property to Required, be sure to start the MSDTC service.
Now execute the package again. On the first run-through, the package fails as before at task 3. The difference occurs when you fix the problem with task 3 and re-execute the package. The result looks like the following screenshot.
Because the container was transacted, the fact that task 3 failed is not recorded in the checkpoint file. Instead, the fact that the Sequence Container failed is recorded; hence, the Sequence Container is re-executed in its entirety when the package is rerun. Therefore, tasks 2 and 3 execute on the second run. Note that the transaction was on the Sequence Container and not the individual tasks. If you have a set of tasks that need to be run in a transaction, the Sequence Container will handle this for you.
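The difference a transaction makes to the restart point can be expressed in a few lines. This is a hedged Python sketch of the behavior described above, not SSIS internals; the function name `tasks_rerun_in_container` is hypothetical, and it assumes the child tasks run one after another in the order listed.

```python
from typing import List


def tasks_rerun_in_container(children: List[str],
                             failed: str,
                             transacted: bool) -> List[str]:
    """Return the child tasks that execute when the package is rerun."""
    if failed not in children:
        raise ValueError(f"{failed!r} is not a child of this container")
    if transacted:
        # The container as a whole is the restart point, so every child
        # runs again, including those that succeeded before the failure.
        return list(children)
    # Without a transaction, the restart resumes at the failed task.
    return children[children.index(failed):]


seq2 = ["2", "3"]
print(tasks_rerun_in_container(seq2, "3", transacted=False))  # ['3']
print(tasks_rerun_in_container(seq2, "3", transacted=True))   # ['2', '3']
```

Rerunning task 2 is the point rather than a side effect: its work was rolled back with the transaction, so skipping it would leave the container half done.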
Variations on a Theme
You may have noticed another property in the task property pages next to the FailPackageOnFailure property — the FailParentOnFailure property. In the previous example, the SEQ 2 container is the parent to the two Execute SQL Tasks 2 and 3. You’ll run through a few variations of the parent/child relationship here so that you can see the differences. In each example, you will force a failure on the first run-through; then you will correct the problem and run the package a second time.
Failing the Parent, Not the Package
What happens if, instead of setting the FailPackageOnFailure property of task 3 to true, you set the FailParentOnFailure property to true? After the failed execution and your fix, the whole package runs again on re-execution. Why? Because no checkpoint file has been written.
NOTE Remember that if you want a checkpoint file to be written, the task that fails must have the FailPackageOnFailure property set to true; otherwise, no file is written.
Failing the Parent and the Package
In this variation, you still have a transacted Sequence Container, and you still have task 3’s FailParentOnFailure property set to true. In addition, set the SEQ 2 Sequence Container’s FailPackageOnFailure property to true. The following screenshot shows what happens on the rerun of the package after a failure.
As you can see, the Sequence Container executes in its entirety, and the output window from the package confirms that you used a checkpoint file and started a transaction:
SSIS package “C:\ProSSIS\Code\Ch15\02_ContainerCheckpoints.dtsx”
Information: 0x40016046 at 02_ContainerCheckpoints: The package
restarted from checkpoint file
“C:\ProSSIS\Files\ContainerCheckpoint.xml”. The package was
configured to restart from checkpoint.
Information: 0x40016045 at 02_ContainerCheckpoints: The package will
be saving checkpoints to file
“C:\ProSSIS\Files\ContainerCheckpoint.xml” during execution. The
package is configured to save checkpoints.
Information: 0x4001100A at SEQ 2: Starting distributed transaction
for this container.
Information: 0x4001100B at SEQ 2: Committing distributed transaction
started by this container.
SSIS package “C:\ProSSIS\Code\Ch15\02_ContainerCheckpoints.dtsx”
Failing the Task with No Transaction
The next variation shows what happens if you set some of the checkpoint properties without having a transaction in place. Start by removing the transaction from your package by setting SEQ 2’s TransactionOption property to Supported. Force an error in task 3, and run the package again to see it fail at task 3. Then fix the problem and re-execute the package. Remember that task 3 has its FailParentOnFailure property set to true, and the SEQ 2 Sequence Container has its FailPackageOnFailure property set to true. The outcome, shown in Figure 15-17, is probably not what you expected: the Sequence Container shows success, yet none of the tasks in the container executed. As you can imagine, this is not very useful. Make sure the properties of the tasks and the containers are set so that the desired tasks execute after the checkpoint is created.
Failing the Package, Not the Sequence
You might assume that if tasks 2 and 3 have the Sequence Container as a parent, then the package itself must be the parent of the Sequence Container. If this is the case, wouldn’t setting FailParentOnFailure on the Sequence Container be the same as setting FailPackageOnFailure on the same container? The quick answer is no. If you try this, you will see that no checkpoint file is written, and by now you know what that means. The message here is that if you want a checkpoint file to be written, then make sure that the restart point has FailPackageOnFailure set to true.
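The four variations are easier to compare side by side. The Python sketch below simply records the outcomes reported in this section as a lookup table keyed by section title; it does not derive them from any rule, and the `Outcome` class and `needs_full_rerun` helper are names invented here for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Outcome:
    settings: str             # property settings used in the variation
    checkpoint_written: bool  # whether the failed run leaves a file
    on_rerun: str             # what the text reports on the second run


VARIATIONS: Dict[str, Outcome] = {
    "Failing the Parent, Not the Package": Outcome(
        "task 3 FailParentOnFailure=true",
        False,
        "whole package runs again"),
    "Failing the Parent and the Package": Outcome(
        "task 3 FailParentOnFailure=true; SEQ 2 FailPackageOnFailure=true; "
        "SEQ 2 TransactionOption=Required",
        True,
        "SEQ 2 executes in its entirety"),
    "Failing the Task with No Transaction": Outcome(
        "task 3 FailParentOnFailure=true; SEQ 2 FailPackageOnFailure=true; "
        "SEQ 2 TransactionOption=Supported",
        True,
        "SEQ 2 shows success but none of its tasks execute"),
    "Failing the Package, Not the Sequence": Outcome(
        "SEQ 2 FailParentOnFailure=true instead of FailPackageOnFailure",
        False,
        "whole package runs again"),
}


def needs_full_rerun(variation: str) -> bool:
    """A variation without a checkpoint file restarts from the beginning."""
    return not VARIATIONS[variation].checkpoint_written


for title, outcome in VARIATIONS.items():
    print(f"{title}: checkpoint={outcome.checkpoint_written}, {outcome.on_rerun}")
```

Read down the `checkpoint_written` column and the pattern in the text stands out: a file appears only in the variations where FailPackageOnFailure is set to true on the object that is meant to be the restart point.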
Inside the Checkpoint File
As promised earlier, it’s time to look inside the checkpoint file to see what it contains now that you have more things to put in there. In the CheckpointScripts.dtsx package shown in the following screenshot, although you have only three tasks, you also have a variable value being changed. The purpose of this package is to show you what kind of information is stored in a checkpoint file. To add a variable, click the designer while in the Control Flow and choose Variables from the SSIS menu, or right-click the Control Flow background and select Variables. Create a variable named intVar and make the variable type integer. Leave the default value of zero.
The Script Tasks will change the value of the intVar variable to 1 and then to 2, and the last Script Task will try to set the variable to the letter “x”. This will cause the last Script Task to fail because of nonmatching data types. To alter the value of a variable in all Script Tasks, you add the variable name to the ReadWriteVariables section on the Script Task’s editor. You then need to add some script to change the value. The following is the Visual Basic script in the first Script Task:
Public Sub Main()
    ' Assign the new value to the package variable listed in ReadWriteVariables
    Dts.Variables("intVar").Value = 1
    Dts.TaskResult = ScriptResults.Success
End Sub
Now run the package. It will fail, as shown in Figure 15-19. A Script Task Error Window may appear. If so, just click Close.
NOTE Instead of altering the Script Task code to make your task fail, you can simply set the ForceExecutionResult property on the task to Failure. You can do this on containers as well.
<DTS:Variable DTS:Namespace="User" DTS:IncludeInDebugDump="6789"
The file is easier to understand broken down into its constituent parts. The first part tells you about the package to which this file applies:
The next section of the file, the longest part, details the package variable that you were manipulating:
<DTS:Variable DTS:Namespace="User" DTS:IncludeInDebugDump="6789"
One of the most important things this part of the file tells you is that the last value assigned to the variable, intVar, was 2. When the package re-executes, it is this value that will be used.
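The idea that a restart picks up the last saved variable value can be modeled outside SSIS. The Python sketch below is only an analogy: it writes a small JSON file rather than the XML format SSIS uses, and the `run_steps` function, the JSON layout, and the name of the third step are all assumptions made for illustration.

```python
import json
import os
import tempfile
from typing import List, Tuple


def run_steps(steps: List[Tuple[str, object]], state_path: str) -> bool:
    """Assign each value to intVar, saving state after every success."""
    variables = {"intVar": 0}
    completed: List[str] = []
    for name, value in steps:
        if not isinstance(value, int):
            # Mimics the type-mismatch failure; nothing is saved for this step.
            return False
        variables["intVar"] = value
        completed.append(name)
        with open(state_path, "w", encoding="utf-8") as handle:
            json.dump({"variables": variables, "completed": completed}, handle)
    return True


STEPS = [
    ("Set intVar value to 1", 1),
    ("Set intVar value to 2", 2),
    ("Set intVar value to x", "x"),  # hypothetical name for the failing task
]

with tempfile.TemporaryDirectory() as folder:
    state_path = os.path.join(folder, "state.json")
    succeeded = run_steps(STEPS, state_path)
    with open(state_path, encoding="utf-8") as handle:
        restored = json.load(handle)

print(succeeded)                        # False
print(restored["variables"]["intVar"])  # 2
print(restored["completed"])
```

Because state is saved only after a step succeeds, the failing assignment never reaches the file, so a restart would see intVar as 2 and only the two completed steps.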
The final part of the file tells you about the tasks in the package and what their outcomes were. It tells you only about the two tasks that succeeded, not the one that failed:
The first container mentioned is the “Set intVar value to 2” Task:
The next, and final, task to be mentioned is the “Set intVar value to 1” Task:
Checkpoints are a great help in controlling the start location in SSIS. This restartability can result in a huge time savings. In packages with dozens or even hundreds of tasks, you can imagine the time saved by skipping tasks that do not need to be executed again. Keep in mind that if two tasks are required to run together during one package run, place these tasks in a Sequence Container, set the container to fail the package, and set the tasks to fail the parent.