It has been well-documented through myriad blogs and forum posts about the case sensitivity of the comparisons in the SSIS lookup transformation (a good review can be found here). In a nutshell, a comparison using the lookup transformation is case sensitive when using the default setting of Full Cache, even if the values in the database are stored in a case insensitive collation. This happens because, in Full Cache mode, the comparison is done by the SSIS engine rather than the database engine, the former of which differentiates between “Value”, “VALUE”, and “VaLuE” regardless of the database collation setting. But did you know that space sensitivity in SSIS lookups matters as well?
In this post, I’ll discuss how space sensitivity in SSIS lookups can have an impact on how data is matched using that transformation.
Space Sensitivity in SSIS Lookups
Consider the following T-SQL code:
As you would expect, executing this code results in a successful match (INNER JOIN), even though we’re comparing CHAR values of differing lengths (for more information, see this article for more information about spaces and padding in SQL Server).
However, when the same comparison is run through an SSIS lookup transformation in Full Cache mode, the lookup on our sample data will fail. Similar to the case sensitive lookup, you’ll find that the SSIS engine would treat the strings ‘Hello World’ and ‘Hello World ‘ (note the trailing spaces) as dissimilar values. Unlike in SQL Server, trailing whitespace is significant in SSIS value comparisons.
As a workaround, you can use the TRIM() function in SSIS and the RTRIM() T-SQL function to insure that your comparisons are ignorant of trailing whitespace. Alternatively, you could use a cache mode other than Full Cache, but you should be aware of the other implications before making such a change.
Space sensitivity in SSIS lookups can change the expected behavior of your data flows, which could lead to some hard-to-detect problems. Be sure that you are aware of how the SSIS lookup transformation handles spaces in data comparisons, and take the time to build the necessary check logic to avoid issues.