Resolving the SSDT-BI IsVisualStudio2012ProInstalled() Installation Error

Recently I was building a new virtual machine for presentations, and loaded up my usual battery – Windows 7, SQL Server 2012, and SSDT-BI.  The installation of these pieces went as expected, and everything appeared to work well – until I attempted to execute an SSIS package on this new setup.  When I did so, I received the following cryptic error message:

[image: the IsVisualStudio2012ProInstalled() error message]

Thinking that perhaps I had gotten a bad download for SSDT-BI, I started fresh again, downloading another copy of the bits and working through the machine build process, only to encounter the same error message again.

Fortunately, after just a bit of digging, I learned that others have encountered this issue when going through the same installation process.  I found this thread pointing out that the issue occurs because one of the assemblies used by the SSIS designer has failed to register properly.  The solution is to re-register the assembly in question.  The exact steps may vary depending on your OS version, the version of SQL Server, etc.  For my setup (Win7 x64, SQL Server 2012 Developer Edition w/ SP2, and SSDT-BI for Visual Studio 2012), the steps I used to correct the issue are below:

  • Open a command prompt window as administrator (right click the shortcut, and click Run As Administrator)
  • Change directories to C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0a\bin\NETFX 4.0 Tools\
  • Execute the following command: gacutil.exe /if "C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE\PrivateAssemblies\Microsoft.SqlServer.Dts.Design.dll"

If the registration was successful, you’ll receive a confirmation message similar to the one below.

[image: gacutil success confirmation message]

Thanks to TechNet user hazel_m_stu_o for the response on the thread mentioned above.

Four things I wish I’d known back then

In the blogging meme of the day, I was tagged by my friend Tim Costello to share four things I wish I’d known back when.  The only hard part was paring the list down to four items.

I think back to 15 years ago, when I was working retail and desperately seeking something else.  Something I could really get into.  Some kind of work I was passionate about.  On a dare, I took the A+ computer certification exam and passed it, and embarked on a career that has led me to where I am today.  Although I miss some of the aspects of that guy I was 15 years ago – bold, fearless, with boundless dreams and ambition – I also wish I could go back and teach my younger self a few things I’ve learned since then.

Failure is a part of growth.

I used to worry a lot more about failure.  I feared that a big failure would end my career, but also worried about day-to-day failures – small mistakes that everyone makes.  I worried that I’d say or do something stupid out of ignorance, and that would be the thing I would forever be remembered for.  It rarely works that way.  In most cases, a person’s career is not measured by their biggest misstep; rather, it is an aggregate of every success and failure – and the attitude they have about those successes and failures.  Failing on the job is a part of the growth process.  As Thomas Edison famously said, “I have not failed. I’ve just found 10,000 ways that won’t work.”  A misstep is only a failure if you don’t learn from it.

You can’t be an expert in everything.

During my first year in IT, I was convinced that I was going to be some kind of technical demigod, and I set out to learn everything about everything – programming, databases, network administration, routing and switching, even Microsoft Access.  You name the technology from the late 1990s, and I probably had a book on it – and intended to master it.  There’s nothing wrong with learning, and to this day I still work regularly to learn about areas outside of my own specialization.  But I wish I’d known back then that maintaining deep expertise in so many different technical topics is exceedingly difficult if not impossible.  I wish I had realized that there’s far more demand for someone who does just a few things, but does them better than 98% of other practitioners of the same craft.

Don’t just have goals. Have a next set of goals.

There was a time fairly recently when I was stuck in the mud.  I was still doing the work I enjoyed, but for a time I was doing it without a real purpose.  Why?  I had met all of the career goals I had set years ago, and I was working largely without a personal charter.  While that’s a great problem to have, I really hadn’t planned for the contingency of completing those objectives.  I wish I could go back and tell myself to be a bit more optimistic about my goals, and to list out another set to be met later, and another after that, and so forth.  Like anything else, goals should be flexible enough to allow you to adapt to changes in the marketplace and your own desires and place in life.  Those goals will be different for everyone, but whatever they are for you, those milestones are critical for measuring progress and challenging yourself to press onward.

Your technical skills don’t matter as much as you think they do.

In the late 1990s and early 2000s, there was still a great deal of truth behind the antisocial geek stereotype.  One could be a reasonably successful technologist and stay hidden away from the front lines of end user interaction.  And I fell into that trap for a while, spending virtually no time honing my interpersonal or speaking skills and just working to refine my technical abilities.  While the ability to deliver technical solutions is important, being able to talk to people (both one on one as well as in a presentation setting) and understand their business and its pain points is critical for the success of the modern technologist.  I’d love to go back and tell my younger self: Continue to work on your technical skills, but get out of the server room occasionally and talk to people.  Learn about their jobs, and what causes them grief.  Understand how to not just build technical components, but how to solve real business problems.

And to keep things going, I’m going to tag a few people to get their input on this.  I’d like to hear what Marc Beacom, Reeves Smith, and John Sterrett have to say on this topic.

Mile High Tech Con

There’s a brand new business intelligence conference launching next month in Denver, Colorado.  The Mile High Tech Con is a three-day event taking place July 24-26, 2014, and is aimed at business intelligence practitioners, data analysts, and information managers/CIOs.

From the event website:

Featuring three days of sessions and events focusing on Data Warehousing and Business Intelligence. Mile High Tech Con will highlight innovative topics that will inspire strategic decision making while offering relationship building and networking. Mile High Tech Con will foster collaboration and discussion so that attendees attain the opportunity to connect with industry experts and fellow developers while examining current and upcoming trends.

Among the topics to be covered at this conference:

  • Power BI
  • Predictive analytics
  • Biml
  • Master data management
  • Data quality
  • Dimensional modeling

There’s a great lineup of speakers, including Peter Myers, Marc Beacom, Reeves Smith, Bill Fellows, and others.  I even managed to convince them to let me speak as well!  I’ll be delivering three presentations:

  • Real World SSIS (full day seminar)
  • Error Handling Patterns in SSIS (two-hour deep dive)
  • Introduction to Data Quality Services (one-hour presentation)

This is the inaugural event for the Mile High Tech Con.  They have a model similar to that of SQLBits – one day of full-day seminars (Thursday), one day of deep-dive material (Friday), and one community day (Saturday).  Best of all, the community day is absolutely free.  Registration is currently open, and early bird pricing for Thursday and/or Friday is in effect through next week.

I’m looking forward to attending and presenting at this event.  If you’re going to be there, give me a shout and let’s get together for coffee.

SSIS Parent-Child Architecture in Package Deployment Mode

This is the second in a series of posts about SSIS parent-child architecture.  You can find the index page here.

In my first post in this series, I covered the essentials of SSIS parent-child design patterns, including a number of advantages of using such patterns.  In this post, I’m going to demonstrate how to build parent-child structures using package deployment mode.  This deployment mode is available (though not the default behavior) in SSIS 2012 and 2014, and is the only deployment mode available in SSIS 2008 and earlier.  You might wonder why I would cover this deployment mode, since it’s no longer the preferred method for SSIS 2012 and 2014, and older versions that require package deployment mode are at least six years and two versions old.  I share this information for two main reasons.  First of all, even though package deployment mode is less commonly used than project deployment mode, there are many organizations still using it – even in recent versions of SSIS – due to legacy packages that were imported from older versions.  Additionally, it seems everywhere I go, folks are still running mission-critical packages on older versions of SSIS (especially 2008), and when I demonstrate features in SSIS 2012 or 2014, I’m frequently asked about the equivalent behavior in those older versions.

Under the package deployment model, there are generally three ways to execute one package from another package:

  • Execute Package Task – This is the built-in tool in SSIS that allows the developer to execute a package from another package
  • dtexec.exe – This is the command-line tool for executing packages
  • dtexecui.exe – This is the user-friendly version of dtexec, with a full UI for executing packages

Each of these three methods can be used to execute packages stored either on the file system or in MSDB.  Because this post addresses parent-child structures, which typically involve packages executing other packages, I’ll focus on the execute package task.

Execute Package Task

Using the execute package task in package deployment mode is a relatively straightforward process; simply drag the execute package task onto the control flow surface and set the configuration options.  As shown in the screenshot below, executing a package stored in MSDB requires only a connection to the SQL Server instance and the full path of the package to be executed.  Optionally, you can also specify the password (if the child package to be executed is password protected) and choose to execute the package in a separate process.

[image: Execute Package Task editor configured for a package stored in MSDB]

Similar options are available for a package deployed to the file system.  Note that you must create a file connection to connect to the package stored in the file system as shown below.

[image: Execute Package Task editor using a file connection for a package stored on the file system]

Under the Expressions tab (shown below), you can optionally configure dynamic values using SSIS expression language.  For example, you could add some logic that would dynamically specify the name of the package, which can be useful when building a work pile pattern (which I will cover in a future post).

[image: Expressions tab of the Execute Package Task editor]
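For instance, an expression on the task’s PackageName property as simple as @[User::vChildPackageName] – where vChildPackageName is a hypothetical variable populated by a loop or a work pile query – would allow the same execute package task to run a different child package on each iteration.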

Passing Values

So far, all straightforward, with very little to configure, right?  For a simple parent-child setup, this may be all you need to do.  But what if you want to make this arrangement more flexible – for example, by passing some values from the parent package to the child package?  A quick review of the execute package task reveals exactly zero hints on how to accomplish this.  By design, the parent package does not explicitly pass values down to child packages; rather, the child package declares – through package configurations – which variables it will consume from the parent.  So, for sharing variables from parent to child, the task list will be:

  • In the parent package, create and populate the variables to be shared with the child package
  • In the child package, create variables that will be populated by values from the parent
  • In the child package, create package configurations – one per variable – to populate the child variables from those in the parent

Creating the variables is very simple.  Start with the parent package, creating a couple of variables – named vIntVar and vStringVar – and setting their values, as shown below.

[image: variables vIntVar and vStringVar created in the parent package, with values set]

Next, in the child package (shown below), create two corresponding variables which will be used to store the values from the parent package.  Note that in this example I have not assigned a default value to the String variable, since the parent package will be supplying that value at runtime.  For this example, you’d still have to supply a default value for the Integer (Int32) variable, because this data type requires a non-NULL default value.

[image: corresponding variables created in the child package]

That’s the easy part.  The more obscure and difficult part is adding in the configurations for these variables.  When working in package deployment mode in SSIS 2014 or 2012 (or on any package in SSIS 2008 or 2005), package configurations are used to allow child packages to consume variable values from the parent package.  The package configurations, defined in the child package, specify which values are to be received from the parent package, and which variables those values should be written into.  To create the package configuration in the child package, right click on an empty space in the control flow and select Package Configurations.  You’ll need to click the checkbox to Enable package configurations, if you haven’t previously used configurations in this package.  The empty Package Configurations Organizer window is shown below.

[image: empty Package Configurations Organizer window]

To create a new package configuration, click the Add… button to open the Package Configuration Wizard window.  On the Configuration type setting, choose Parent package variable, and type in the name of the first variable to configure (vIntVar) as shown below.

[image: Package Configuration Wizard with the Parent package variable configuration type selected]

Wait – I have to type in that variable name?  Why not have a drop-down list to let me choose it?  That’s an excellent question, and one that I too asked when I first started with package configurations.  The reason you have to type in the variable name is that the child package has no explicit relationship to the parent package.  Theoretically, this child package could be executed from any parent package – it’s not bound to any particular parent.

Once you’ve typed in the name of the parent package variable, click Next > and go to the next configuration page.  Here, you specify where to put this value from the parent package variable.  As shown below, you will use the package tree view to specify the variable to which you’re going to write this value.  Drill into the variable name, expand the Properties beneath that variable, and click on Value to select it as the target of this configuration assignment.

Just as a side note, if you drill into the folders below the list of variables, you can see that you don’t necessarily have to use an SSIS variable in the child package to capture the value from the parent package; you can write directly to other properties of the package (such as connection strings).  For a variety of reasons – most notably, easier troubleshooting – I recommend that you always write a value from a parent package configuration into an SSIS variable in your child package.  You can always assign that child package variable’s value to a built-in property – for example, a connection string – using an expression elsewhere in the package.
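For example, assuming a hypothetical child package variable named vConnString that is populated from a parent package configuration, a property expression of @[User::vConnString] on a connection manager’s ConnectionString property would apply the parent-supplied value at runtime, while still giving you a visible, easily testable variable in the child package.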

[image: selecting the Value property of the target variable in the package tree view]

After selecting Variables –> vIntVar –> Properties –> Value, click Next > and supply a name for this configuration as shown below.  To keep things simple for this example, I’ve simply named the configuration with the same name as the variable (though you can give it a different name, if you prefer).

[image: naming the new package configuration]

For the other variable (vStringVar), you can repeat this process, mapping the variable of that name in the parent to the variable of the same name in the child package.

With both configurations in place, the package is ready to be run for testing.  In the child package, I’ve created a script task that opens a Windows message box showing the variable values.  As shown below, the message box from the script task confirms that the child package variable values are supplied by the parent package.

[image: message box displaying the values passed in from the parent package]

As I discussed in my previous post, you now have a simple package infrastructure that lends itself to easier development and troubleshooting, streamlined error handling, and less code repetition.  It’s difficult to see with just two packages, but in an ETL system with dozens or hundreds of packages, the advantages of these parent-child patterns quickly become obvious.

Parting Thoughts

A few things to keep in mind when using the execute package task in package deployment mode:

  • The parent package variable and the child package variable do not have to have the same name.  I kept the same names in this example for clarity, but there’s no technical requirement that you name them the same.  The binding between parent package variable and child package variable is established in the configuration, not by virtue of naming.
  • If you create a parent package variable configuration for a variable that does not exist in the parent package, the package execution will not fail for that reason; it will simply leave intact the default value (if any) for that child package variable.  Other failures might occur related to that missing parent package variable (for example, a missing connection string if you are using configurations to pass connection strings), but a missing parent package variable alone won’t cause a package failure.
  • The transfer of values from parent to child is a one-way transfer.  The information is passed by value, meaning that a copy of the value (not a reference to the original value) is given from parent to child.  Therefore, if you modify the value of the child package variable that was originally loaded from a parent package configuration, it has no impact on the original value in the parent package.  I will demonstrate in a later post in this series how to pass values by reference from the child package back to the parent package.

Conclusion

In this post, I’ve briefly demonstrated a simple pattern for implementing a parent-child architecture using the package deployment mode.  In my next post, I’ll go into depth on using this architecture in the project deployment mode in SSIS 2012 and 2014.

Speaking at the SQL PASS 2014 Summit

I’m happy to announce that I will be speaking at the SQL PASS Summit this fall.  The summit will be held during the first week of November in Seattle, Washington.  This will be my seventh year attending the PASS Summit, and my fourth year as a speaker.

This year I will be delivering a new presentation entitled “Building Bullet-Resistant SSIS Packages”:

It’s 2:30 a.m., and you’ve just gotten that call no ETL developer wants to get: an SSIS package has failed unexpectedly. Immediately, you start asking yourself: “What could have caused this?”, “What do I do to fix the data from this failed execution?”, and, most importantly, “Could I design the package in such a way that this doesn’t happen again?”

In this demo-packed session, we will examine the design patterns and logistics of building error-resistant SSIS packages. Starting with the control flow, we’ll work through various tools and patterns that can be used to prevent, or at least handle appropriately, task-level errors. We’ll then move to the data flow, and discuss and demonstrate how to proactively address problem data to prevent unexpected package failures.

I’m looking forward to presenting this session, but more importantly, I’m excited about meeting new community members and catching up with #sqlfamily.  I hope to see you in Seattle in November!

Upcoming Presentations in May

Next month, I’ll be making a couple of stops at SQL Saturday events in the south.  On May 3, I’ll be attending SQL Saturday #285 in Atlanta and presenting my SSIS Performance session.  I’ll be traveling with my good friend and neighbor Ryan Adams, who is also presenting at the event.  In addition, several of my Linchpin People cohorts will also be in attendance, which means I should probably bring along some extra bail money.  Although I’ve been to Atlanta several times, I’ve never gotten to attend the SQL Saturday there, so I’m looking forward to meeting some new people.

The following weekend, I’ll be visiting the good folks in Houston for SQL Saturday #308.  At this event I’ll be sharing two of my favorite topics: SSIS performance and SSIS scripting.  Having previously attended SQL Saturday in Houston, I know a little about what to expect (Texas barbecue for lunch…. holla!), and I always dig hanging out with fellow Texans down south.  This event will be a first for me in that I’m taking along my whole family for the trip.

If you’re in Atlanta or Houston for either of these events, come by one of my sessions and say hello!  I hope to see you there.

SQL Saturday Lisbon

It’s a little over a week until this year’s SQL Saturday festivities kick off in Lisbon, Portugal, and I’m very excited to be a part of it.  Registration is nearly full, so if you’re in the area and are planning on attending, register now!

For this event, I’m delivering a full day workshop entitled “Real World SSIS: A Survival Guide”, during which I’ll share design patterns and practical lessons I’ve learned over my 10-ish years in the BI/ETL space.  This workshop will be held on Thursday, April 10th (the Thursday prior to the main SQL Saturday event), and there are still some seats available.  You can register for this workshop online.  I’ve also recorded a teaser video of what’s to come in this workshop.

In addition to the full-day workshop on Thursday, I’ll also be presenting two one-hour sessions on Saturday.  I’ll be sharing “Handling Errors and Data Anomalies in SSIS” and “15 Quick Tips for SSIS Performance” during the regular SQL Saturday event.

If you plan on attending SQL Saturday in Lisbon, please stop by and say hello!  I’m looking forward to seeing you there.

Lunch with someone new

I met up for lunch with a good friend and former coworker today, and among the topics of discussion was how we as professionals often neglect personal relationships when work and life get busy.  I’ve found that to be especially true since I started working from home last year.  I don’t miss a lot about working in an office setting, but I do long for the days of hallway conversations and working lunches with colleagues. When working in isolation, it can be easy to get into cocoon-mode, shutting out the rest of the world – to the detriment of interpersonal skills and relationships.  Through my work as a professional presenter, I get to talk to a lot of people, but more often than not I’m talking to them in a group setting with little one-on-one interaction.  While the former is useful for building a list of contacts, it doesn’t do much to truly build relationships.

Five years ago, in January of 2009, I set a goal for myself to have lunch or drinks with someone new – not necessarily a stranger, but someone with whom I had not spent any one-on-one face time – on a monthly basis.  I exceeded that goal in a big way.  And I don’t think it’s an accident that 2009 and 2010 were two of the biggest growth years of my career. I didn’t land any work directly as a result of those relationships – in fact, several of the people with whom I met weren’t business associates but personal acquaintances. For me, the bigger benefit was to get out of my comfort zone and get to know more people on a personal basis, whether or not I saw a direct career benefit to meeting with them.  I firmly believe that, five years later, I’m still seeing benefits of getting out of that comfort zone.  And just as importantly, I had a lot of fun!

So I’m going to rekindle this goal.  Since it’s not January, I don’t have to call this a New Year’s resolution, but I’m going to commit to sharing a meal or drinks with someone new at least once a month (including this month) for the remainder of this year.  I hope to exceed the goal as I did in 2009.

If you’re not regularly spending face time with peers and acquaintances, I would encourage you to give it a try.  Go out for coffee with someone you meet at a professional event.  Have lunch with an acquaintance.  Even if it’s uncomfortable for you – no, especially if it’s uncomfortable for you – it can pay big dividends in the long run.

Parent-Child SSIS Architecture

This is the first in a series of technical posts on using parent-child architectures in SQL Server Integration Services.  The index page for all posts can be found here.

In this post, I will provide an overview of the architecture and describe the benefits of implementing a parent-child design pattern in SSIS structures.

Definition

The simplest definition of SSIS parent-child architecture is that it consists of packages executing other packages.  In SSIS, the package is the base executable; it is the most granular component that can be executed independently [1].  Every version of SSIS includes the ability for one package to execute another package through the use of one of the following:

  • The Execute Package Task
  • T-SQL commands
  • The Execute Process Task (using the dtexec.exe utility)

Though the reference to parent-child architecture implies that there are exactly two layers to this design, that does not have to be the case.  You may have a design where a package executes a package which executes a package, and so forth.  If there is a hard limit to how deeply a parent-child architecture may be nested, I have never encountered it.  I have found it useful on a few occasions to go deeper than two levels in this type of architecture, particularly when designing a formal ETL framework (to be discussed further in a future post in this series).  In cases where more than two levels exist, finding the right terminology for those layers is important.  You can refer to them by patriarchy (grandparent/parent/child) or by cardinality (level 1, level 2, level n), as long as you remain consistent – especially in your documentation – with those references.

Conceptually, a parent-child architecture is a form of code abstraction.  By encapsulating ETL actions into discrete units of work (packages), we’re creating a network of moving parts that can be developed, tested, and executed independently or as part of a larger collection.

Benefits

As I mentioned in my introductory post, there are several benefits to using parent-child structures in SSIS.

Reusability.  In any ETL environment of significant size or complexity, it’s quite normal to discover common ETL behaviors that are reusable across different implementations.  For a concrete example of this: In my spare time, I’m working on an ETL application that downloads XML files from a Major League Baseball web service.  There are files of various formats, and each file format is processed a different way, but with respect to the download of the files, I always perform the same set of operations: create a log entry for the file; attempt to download the file to the local server; log the result (success or failure) of the download operation; if the download has failed, set the HasErrors variable on the main package.  If I were to load this behavior into a group of tasks in the package for each XML format, I’d have five different copies of the same logic.  However, by building a parameterized child package that performs all of these core functions, I only have to build the file download/logging logic once, and execute the resulting package with the appropriate parameters each time I need to download a file.

Easier development.  Working with large and complex SSIS packages can be a pain.  The larger the SSIS packages, the longer it takes for the BIDS or SSDT environment to do its validation checks when the package is opened or modified.  Further, when multiple ETL developers are working on the same project, it is much easier to break apart the project into discrete units of work when using numerous smaller SSIS packages.

Easier testing and debugging.  When working through the test and debug cycles during and after initial development, it’s almost always easier to test and debug smaller packages.  To test a single task that resides in a large SSIS package would require either running the task by itself manually in the Visual Studio designer, or disabling all of the other tasks and redeploying the package.  When working with packages that each perform one unit of work, one can often simply execute the package to be tested through the normal scheduling/execution mechanism.

Clarity of purpose. An architecture that uses small, single-operation packages lends itself to clarity of purpose by virtue of naming.  When browsing a list of deployed packages, it is much clearer to see package names such as “Load Customers Table”, “Merge Product Table”, and “Remove Duplicates in Vehicle Table” than to find do-everything packages with names like “Load Production DB”, “Update DW”, etc.

Performance. In some cases, breaking up a multi-step SSIS package can bring some performance gains.  One distinct case that comes to mind is using a distributed architecture, where packages within a single execution group are executed on multiple servers.  By distributing packages across different SQL Server machines (either physical or virtual), it may be possible to improve performance in cases where the processing load on a single SSIS server has become a bottleneck.  I want to emphasize that using a parent-child architecture does not automatically improve performance, so this should not be used as a silver bullet to improve a poorly performing group of packages.

The Tools

As I mentioned earlier, there are three tools that can be used to execute a package from within another package.

The execute package task.  This is the easiest and most common means of executing a package from within another.  This task can trigger the execution of a package stored on the file system, deployed to MSDB or the SSIS catalog, or residing in the same project.  If using SSIS 2012 with catalog deployment mode, you can also use the execute package task to pass parameter values from the parent package to the child package.  It is important to note that the execute package task behaves differently in SSIS 2012 than it does in older versions.

T-SQL commands (via the execute SQL task).  For SSIS projects using the project deployment model in SSIS 2012, the built-in stored procedures in the SSIS catalog can be used to execute packages.  This method for executing packages, like the execute package task, allows you to specify runtime parameters via T-SQL code.  One significant advantage of using T-SQL commands to execute packages is that, unlike the execute package task, you can use expressions to set the name of the package to be executed at runtime.  This is useful in cases where you are iterating over a list of packages that may not be known until runtime, such as in a pattern found in ETL frameworks.
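As a minimal sketch of this method – the folder, project, package, and parameter names below are hypothetical, and the project is assumed to be deployed to the SSIS catalog – executing a child package and passing it a parameter value might look like this:

  -- Create an execution for the child package
  DECLARE @execution_id BIGINT;
  EXEC SSISDB.catalog.create_execution
      @folder_name = N'ETL',
      @project_name = N'ParentChildDemo',
      @package_name = N'ChildPackage.dtsx',
      @execution_id = @execution_id OUTPUT;

  -- Supply a value for a package parameter (object_type 30 = package parameter)
  EXEC SSISDB.catalog.set_execution_parameter_value
      @execution_id = @execution_id,
      @object_type = 30,
      @parameter_name = N'pEnvironment',
      @parameter_value = N'Test';

  -- Start the execution (this returns immediately; the package runs asynchronously by default)
  EXEC SSISDB.catalog.start_execution @execution_id = @execution_id;

Because the package name here is just a string argument, it can be assembled dynamically in T-SQL, which is what makes this method a natural fit for iterating over a work pile of packages.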

dtexec.exe (via the execute process task).  Using this method allows you to trigger package execution via the command-line application dtexec.exe.  This method is typically used to execute packages in a standalone environment – for example, when using third-party scheduling tools to orchestrate package execution – but dtexec can also be used within SSIS by way of the execute process task.  As an aside, I rarely use dtexec to execute child packages – in most cases, it’s easier to use either the execute package task or T-SQL commands to execute one package from within another.

I’ll also briefly mention dtexecui.exe.  This is a graphical tool that serves the same purpose as dtexec.exe, except that it exposes that functionality through a graphical user interface rather than command-line parameters.  Except for this brief mention, I’ll not cover dtexecui.exe in this discussion of parent-child architecture, as that tool is intended for interactive (manual) execution of packages and is not a suitable tool for executing one package from within another.

Parent-Child architecture in the real world

To illustrate how this can work, let’s model out a realistic example.  Imagine that we have charge over the development of a sizeable healthcare database.  In addition to our production data, we’ve got multiple environments – test, development, and training – to support the development life cycle and education needs.  As is typical for these types of environments, these databases need to be refreshed from the production database from time to time.

The refresh processes for each of these environments will look similar to the others.  In each of them, we will extract any necessary data for that environment, retrieve and restore the backup from production, and import the previously extracted data back into that environment.  Since we are dealing with sensitive healthcare data, the information in the training database needs to be sufficiently anonymized to avoid an inappropriate disclosure of data.  In addition, our test database needs to be loaded with some test cases to facilitate testing for potential vulnerabilities.  Even though there are some differences in the way each environment is refreshed, there are several points of shared – and duplicate – behavior, as shown below (with the duplicates in blue).

[image: ParentChild-Duplicate – refresh processes for the test, development, and training environments, with the duplicated steps shown in blue]

Instead of using duplicate static elements, we can eliminate some code redundancy and maintenance overhead by encapsulating those shared behaviors into their own container – specifically, a parameterized package.  In doing so, we can avoid having multiple points of administration when (not if) we need to make adjustments to those common elements of the refresh process.  The updated architecture uses parameters (or package configurations, if using package deployment mode in SSIS 2012 or any older version of SSIS) to pass in the name of the database environment to refresh.

[image: ParentChild-HighLevel – updated architecture with the shared behaviors moved into a parameterized RefreshDB package]

As shown, we’ve moved those shared behaviors into a separate package (RefreshDB), the behavior of which is driven by the parameters passed into it.  The duplicate code is gone.  We now have just one SSIS package, instead of three, that needs to be altered when those common behaviors change.  Further, we can individually test and debug the child package containing those common behaviors, without the additional environment-specific operations.

Note that we haven’t reduced the number of packages using this architecture.  The goal isn’t fewer packages.  We’re aiming for a modularized, easy-to-maintain design, which typically results in a larger number of packages that each perform just a few functions (and sometimes only one).  In fact, in the parent-child architecture shown above, we could further refine this pattern by breaking out the individual operations in the RefreshDB package into packages of their own, which would be practical for cases in which those tasks might be executed apart from the others.

Exceptions to the rule

Are there cases in which parent-child structures do not add value?  Certainly.  A prime example of such a case is a small, simple package developed for a single execution with no expectation that its logic will be reused.  I call these throwaway packages.  Because of their single-use nature, there is likely little value in going through the effort of building a parent-child architecture around their business logic.

Up Next

In my next post in this series, I’ll work through the mechanics of using a parent-child pattern in SSIS 2005 or SSIS 2008.

[1] Technically, there are lower-level items in the SSIS infrastructure that can be executed independently.  For example, from the BIDS or SSDT design surface, one can manually execute a single task or container within a package.  However, when deploying or scheduling the execution of some ETL behavior, the package is the lowest level of granularity that can be addressed.

Edit: Corrected typo on one of the graphics.

SSIS and PowerPivot training in Baton Rouge

I’m happy to announce that I’ll be teaming up with my Linchpin People colleague Bill Pearson for a day of BI and SSIS training next month.  On Wednesday, February 12th, we’ll each be delivering a full-day presentation in Baton Rouge, after which we’ll be joining the Baton Rouge SQL Server User Group for their monthly meeting.

SSIS Training with Tim Mitchell

I’ll be presenting Real World SSIS: A Survival Guide, which is aimed at beginning-to-intermediate SSIS developers.  In this day-long training session, I’ll be sharing and demonstrating many of the ETL lessons that I’ve learned in my 10+ years working in the SQL Server business intelligence ecosystem.

At the same time, Bill Pearson will be delivering Practical Self-Service BI with PowerPivot for Excel, which will provide a crash course for those who are new to PowerPivot.  Following these day-long presentations, Bill will also share more on PowerPivot at the Baton Rouge SQL Server User Group that evening.

Registration for both of these day-long courses is currently open, and early-bird pricing is available for a limited time.  If you’re around the Baton Rouge area and are interested in learning more about SSIS or PowerPivot, we’d love to have you join us next month!