The SSIS Object Variable and Multiple Result Sets

In my most recent post in this series, I talked about how to use the SSIS object variable as an ADO recordset as a source in a data flow. By loading the result set of a query into this variable, the contents of the variable can be read by an SSIS script component and sent out through the SSIS pipeline.

In working up the example in the previous blog post, I wondered to myself: What happens if the source query returns more than one result set?

Multiple Result Sets with the Execute SQL Task

In reality, data retrieval queries are typically configured to return exactly one result set. If two different result sets are needed, two separate queries are built, each with its own source-to-target flow of data. However, I’ve seen more than a few queries (typically in the form of vendor stored procedures, which, of course, cannot be edited) which return more than one set of results in a single execution. Even some of the built-in SQL Server stored procedures do this (sp_help comes to mind). From wherever they come, any ETL process that consumes data generated by these types of queries or stored procedures must be configured to handle multiple result sets.

Thinking through the previous example, I wanted to experiment with what would happen if the query in an execute SQL task returns more than one result set. To test this, I created a stored procedure that purposefully returns two different result sets, as shown below:

image

Predictably, the stored procedure above returns two sets of results.

image

To continue the test, I’ve added an execute SQL task to the control flow, using the stored procedure above as the query. I set the ResultSet to Full Result Set.

image

Specifying a ResultSet value of Full Result Set requires the use of an object-typed variable to store the results.

image

Next, I’ll use a script component, configured as a source, to programmatically extract the information from the ADO recordset object. In my last post, I used an OleDbDataAdapter to fill a DataTable with the data from the object variable. In this case, I’ve changed that just a bit, instead populating a DataSet object which should allow us to capture more than one result set (since the DataSet can contain one or more DataTables). I’ve also added a second output, setting up the appropriate columns for the added result set. The snippet below shows the example of looping through each result set to build the data for each output. (Teaser: Before you try to execute this code, don’t execute this code. It won’t work. I’ll explain why below.)

// Set up the DataAdapter to extract the data, and the DataSet object to capture both output tables
OleDbDataAdapter da = new OleDbDataAdapter();
DataSet ds = new DataSet();

// Extract the data from the object variable into the DataSet
da.Fill(ds, Variables.vResults, "myTable");

// Process the first result set
foreach (DataRow dr in ds.Tables[0].Rows)
{
    // Create a new, empty row in the output buffer
    SalesOutputBuffer.AddRow();

    // Now populate the columns
    SalesOutputBuffer.SalesOrderID = int.Parse(dr["SalesOrderID"].ToString());
    SalesOutputBuffer.RevisionNumber = int.Parse(dr["RevisionNumber"].ToString());
    SalesOutputBuffer.OrderDate = DateTime.Parse(dr["OrderDate"].ToString());
    SalesOutputBuffer.ShipDate = DateTime.Parse(dr["ShipDate"].ToString());
    SalesOutputBuffer.Status = int.Parse(dr["Status"].ToString());
    SalesOutputBuffer.TotalDue = decimal.Parse(dr["TotalDue"].ToString());
}        

// Process the second result set
foreach (DataRow dr in ds.Tables[1].Rows)
{
    // Create a new, empty row in the output buffer
    SalesDetailOutputBuffer.AddRow();

    // Now populate the columns
    SalesDetailOutputBuffer.SalesOrderID = int.Parse(dr["SalesOrderID"].ToString());
    SalesDetailOutputBuffer.SalesOrderDetailID = int.Parse(dr["SalesOrderDetailID"].ToString());
    SalesDetailOutputBuffer.OrderQty = short.Parse(dr["OrderQty"].ToString());
    SalesDetailOutputBuffer.LineTotal = decimal.Parse(dr["LineTotal"].ToString());

}

When I update the script component source with this code and execute the package, it fails with the following error:

image

This scripting error is a bit cryptic, but it tells me all I need to know: “Cannot find table 1”. The second code block, which attempts to iterate over ds.Tables[1].Rows, fails because there is no table at index 1 (the second table, or what should be the second result set). I can infer from this that the execute SQL task has loaded only the first result set into the SSIS object variable, even though the stored procedure returned two different result sets. If I comment out the section that attempts to process the second table, the package runs fine and loads just the data from the first result set.

So if the execute SQL task does work for this scenario, how exactly does one retrieve data from a stored procedure or query which produces multiple result sets? The easiest method, as it turns out, is to use the script component to connect directly to the database and execute the query.

Handling Multiple Result Sets in the Script Component

All other things being equal, the simplest approach is usually the best. In this case, I can eliminate the middleman entirely (the execute SQL task) and connect directly to the source database from within the script component, retrieving the necessary data from all of the result sets. In testing this solution, I came across an excellent article written some years back by Bob Pearson, in which he describes and demonstrates (in VB.NET) the solution to the multiple result set quandary. With Bob’s permission, I have adapted the design pattern he described to C#.

In this pattern, the entire source operation – querying the source database and splitting out the result sets into their appropriate outputs – is handled in the script component. I should point out that this example uses the ADO.NET connection manager rather than the OleDB connection manager. Although it is technically possible to access the OleDB connection manager in script, the syntax to do so is a bit quirky, so I’ll keep it simple and use ADO.NET.

#region Namespaces
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
#endregion

// Add in the appropriate namespaces
using System.Data;
using System.Data.SqlClient;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    public override void CreateNewOutputRows()
    {
    // Use the existing package connection manager
        IDTSConnectionManager100 connMgr = Connections.AdventureWorksSource;
        SqlConnection conn = (SqlConnection)connMgr.AcquireConnection(null);

    // Execute the stored procedure that will generate the two result sets
        SqlCommand cmd = new SqlCommand("EXEC [dbo].[usp_GetSalesData]", conn);
        SqlDataReader reader = cmd.ExecuteReader();

        while (reader.Read()) // Read() will return false once we've processed all rows in the first result set.
        {
            // Create a new, empty row in the output buffer
            SalesOutputBuffer.AddRow();

            // Now populate the columns
            SalesOutputBuffer.SalesOrderID = int.Parse(reader["SalesOrderID"].ToString());
            SalesOutputBuffer.RevisionNumber = int.Parse(reader["RevisionNumber"].ToString());
            SalesOutputBuffer.OrderDate = DateTime.Parse(reader["OrderDate"].ToString());
            SalesOutputBuffer.ShipDate = DateTime.Parse(reader["ShipDate"].ToString());
            SalesOutputBuffer.Status = int.Parse(reader["Status"].ToString());
            SalesOutputBuffer.TotalDue = decimal.Parse(reader["TotalDue"].ToString());
        }

        // The NextResult() function moves us to the next result set
        reader.NextResult();

        // Process the second result set
        while (reader.Read())
        {
            // Create a new, empty row in the output buffer
            SalesDetailOutputBuffer.AddRow();

            // Now populate the columns
            SalesDetailOutputBuffer.SalesOrderID = int.Parse(reader["SalesOrderID"].ToString());
            SalesDetailOutputBuffer.SalesOrderDetailID = int.Parse(reader["SalesOrderDetailID"].ToString());
            SalesDetailOutputBuffer.OrderQty = short.Parse(reader["OrderQty"].ToString());
            SalesDetailOutputBuffer.LineTotal = decimal.Parse(reader["LineTotal"].ToString());
        }
        
    }
}

After making this change to bypass the SSIS object variable method and simply query the database directly from the script component source, the script component source runs successfully. I can then send the two outputs, each containing a distinct result set from the source stored procedure, to their respective output tables.

image

Tl;dr: In the rare cases when you have to process a query with multiple result sets, skip the SSIS variable and use a direct connection through the script component source.

Thanks again to Bob Pearson for allowing me to adapt his VB.NET code into C# for this example.

About the Author

Tim Mitchell
Tim Mitchell is a data architect and consultant who specializes in getting rid of data pain points. Need help with data warehousing, ETL, reporting, or training? If so, contact Tim for a no-obligation 30-minute chat.

12 Comments on "The SSIS Object Variable and Multiple Result Sets"

  1. https://msdn.microsoft.com/en-us/library/cc280492(v=sql.100).aspx. use ADO.net as connection manager instead of OLEDB. it worked for me in 2008.

  2. Thanks Tim This helped me a lot

  3. This is a great post! I have two outputs in my sp and I set up both in the Inputs and Outputs of the script task as well as in the script itself. I am trying to run it but get an error message:

    Error: 0xC0047062 at Data Flow Task Load MedRec sp, Script Component Run SP [134]: System.FormatException: String was not recognized as a valid DateTime.

    How do I catch which particular DateTime column gave me an error since the error did not mention it?

    • Unfortunately, finding out which column(s) caused an error is a bit clunky in SSIS. You could include some try/catch logic in your code to capture the name of the column. It takes some extra code to do this (especially if you have many columns to process) but it can help a lot with troubleshooting.

      Just FYI, SSIS in 2016 is supposed to make getting the error column name quite a bit easier. However, I’ve yet to test how this would work in the script component.

  4. Thomas Evers | May 2, 2017 at 2:42 pm | Reply

    I believe that I encountered this multiple result set error accidentally. I wanted to delete rows in a destination table where their datekeys were within the date range from my sort table:
    DECLARE @MinDateKey,@MaxDateKey INT

    — query to set variables, but which inadvertently adds an resut set.
    SELECT @MinDateKey = MIN(DateKey),@MaxDateKey = MAX(DateKey)
    FROM StagingTable

    — The stored procedure then runs another query that ssis parses and logs which results in a error message that is misleading at best.
    SELECT VariousReturnValuesThatAreAttemptedToBeParsedBySSIS

    Fortunately I found this article and changed the above select to a pair of SETs to resolve the problem.

    Thanks Tim!

  5. Where does SalesOutputBuffer come from? How is it defined and where?

    • It took me a second to figure this out. I believe it is defined in the Inputs and Outputs of the Script Component. I believe you’d also want to add the columns in here that it is pulling in.

  6. Does the connection manager for the source data have to have AllowMultipleResultsets=True set? Are we making that assumption in this article?

  7. Hi Tim. Very grateful to you for this excellent solution to the multiple result sets query. It helped me great deal in my work.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.