There are several ways to execute SSIS packages that have been deployed to the SSIS catalog, and my favorite way of performing this task is to execute SSIS packages with T-SQL. There are a couple of challenges that come up when executing catalog-deployed packages using the default settings in T-SQL, but I have workaround for those issues which I’ll cover in this post.
In this post I will demonstrate a couple of my favorite tricks for improving the functionality and output of SSIS catalog package executions invoked from T-SQL.
Execute SSIS Packages with T-SQL
To quickly review, there are two stored procedures used to execute catalog-stored SSIS packages:
- catalog.create_execution: This stored procedure is used to set up the execution in the SSIS catalog, but doesn’t actually execute the package. This initial creation step allows us to set up parameters or add data taps before the execution begins. This stored procedure has a BIGINT output parameter named @execution_id that is used as a handle for other pre-execution steps and the actual execution of the package. Had I been designing this system, I would have configured this to be the default behavior. It’s a rare case that I actually need to make an asynchronous call to [catalog].[start_execution]. From a workflow perspective, I almost always need the script to run until the package is finished.
- catalog.start_execution: This stored procedure kicks off the execution in the SSIS catalog. The execution ID to be started is the one returned in the @execution_id output parameter from catalog.create_execution above.
Used together, these stored procedures manage the creation and execution of catalog executions, and they work quite well. As shown below, the create_execution stored procedure is executed, followed by start_execution:
However, there are a couple of challenges presented when using this pattern:
The call to [catalog].[start_execution] is asynchronous by default. What this means is that when you execute a package using this stored procedure, the stored procedure runs just long enough to kick off the execution, and will return almost immediately. If the package runs for several minutes or hours, the stored procedure still returns as soon as the execution begins. Although there are cases where this might be the desired behavior, an asynchronous execution is usually the exception.
The call to [catalog].[start_execution] succeeds even if the package fails. As long as there are no syntactical or structural issues – such as an invalid execution ID, a required parameter that is not specified, or insufficient permissions on the package – the execution of catalog.start_execution will succeed. As noted above, there are cases where this might be the intended behavior, but this design is not common.
Fortunately, these issues are not show-stoppers. There are workarounds for each of these to improve the program flow of SSIS execution.
A Better Way
For the majority of the ETL processes I develop, I use the following methods to work around the limitations noted above.
Execute Packages Synchronously
This one is a relatively easy fix. Although the default behavior is to execute [catalog].[start_execution] asynchronously, there is a system parameter that lets you override this behavior on a per-execution basis. The parameter name is SYNCHRONIZED, and it expects a value of either 0 (not synchronized, the default behavior) or 1 (synchronized). When overriding this value and setting it to 1, the call to [catalog].[start_execution] will not return until the package has completed its execution. This means that [catalog].[start_execution] will run as long as the package is still running. The syntax to add this parameter is shown below.
As shown, parameter values (including system, project, and package parameters) are added after [catalog].[create_execution] and before [catalog].[start_execution]. In the above example I set the SYNCHRONIZED system parameter to 1, forcing [catalog].[start_execution] to wait until the package completes before it returns.
This override solves one problem but leaves another: Even though [catalog].[start_execution] now runs synchronously, it does not report the status of the package execution. Most workflows need to know the status of the package execution, not just that the execution has finished. Fortunately, there is a workaround for that as well.
Force a Script Failure if the Package Fails
This workaround requires more than a simple execution parameter, so it has more moving parts. In a nutshell, here is the design pattern for forcing the T-SQL script to fail in the event that the package it has called fails:
- Create the execution
- Set the SYNCHRONIZED parameter to 1
- Execute the package
- Check the execution log for this execution, and force the T-SQL script to fail if the package failed
The previous script already took care of steps 1-3, so we just need to add the final step. This can be as simple or as complicated as needed, but in my example I just check [catalog].[executions] and retrieve the [status] value for that execution; if the value does not equal 7 (the status ID for Success), the script will generate a failure using RAISERROR.
Using T-SQL stored procedures is the most flexible way to execute catalog-deployed SSIS packages, but it does have some limitations. In this post I have demonstrated how to work around the issue of synchronicity and status reporting when executing catalog packages from T-SQL.