Tim Mitchell
Follow Tim Mitchell on Twitter  Like Tim Mitchell on Facebook  Subscribe on YouTube  Connect on LinkedIn  Subscribe to the Data Geek Newsletter

Accessing SFTP with SSIS

SFTP with SSISIf you’ve used SSIS for any significant amount of time, you’ve almost certainly run into this roadblock: You need to download a file stored on an SFTP (secure FTP) server, but connecting to SFTP with SSIS is not natively supported. There is a built-in FTP task in Integration Services, but it is very limited in its functionality and will not connect to an SFTP server. However, it is possible to build your own solution using free tools and a bit of code.

In this post, I’ll show how to access SFTP with SSIS by building a reusable package to download secure FTP files.

Possible Solutions

Before I dive into my solution, I’ll point out that there are several ways to solve the problem of downloading files stored on SFTP with SSIS:

  • Acquire a third-party SSIS task suited for this purpose (of which there are several free and paid versions)
  • Develop a purely code-based solution using a third-party DLL
  • Develop a code-based solution that calls a command-line SFTP application

I’ve used each of these solutions in the past, and each one has its own merits and risks. For example, using a third-party SSIS task provides a more streamlined approach (often requiring no manual coding at all), but often – especially in larger organizations – installing a third-party component requires so much extra testing and administrative approval that it makes it more trouble than its worth. For coding solutions using external DLLs, often the same issues exist because the external DLLs have to be registered on each server, which is an easy enough process but surprisingly difficult to get approved in many cases. In my work as a consultant, I’ve found that the easiest solution – when considering not just the technical components but the administrative work – is to use a command-line approach leveraging a mature and well-documented SFTP client. Therefore, my go-to solution for consuming SFTP with SSIS is the last one on the list above, which is the solution I describe herein.

SFTP With SSIS: The Tools

To develop and test this solution, I use just a couple of tools:

WinSCP. This is a mature, well-documented, and free (GPL) SFTP client. It has both a UI and a command-line component, though I use the latter exclusively for this solution. The portable executables (near the top of the downloads page) are the easiest to use, as you simply drop the two executables into a folder and start using them without going through an installation process.  There are, of course, other free command-line SFTP tools available, but I’ve found none better than WinSCP in terms of reliability, ease of use, and documentation. For more information about WinSCP, check out the documentation pages, which are quite comprehensive and can give you a better understanding of how the command-line WinSCP tool works and its myriad configuration options. In particular, this article was very helpful for some of the syntax questions I had when developing this solution.

Core FTP mini-SFTP server. Although this is not a part of the deployed solution itself, this provides a very simple and easy-to-configure SFTP server for testing purposes.

Also, in this example I’m using SSIS 2014 to demonstrate the solution, but this design pattern for accessing SFTP with SSIS is portable across all versions of Integration Services.

Setting Up the Environment

For the initial setup, download WinSCP and the Core FTP mini-SFTP server executables listed above. To keep things simple, I recommend creating a utility directory to store these executables. As noted, WinSCP comes with two executables – WinSCP.exe and WinSCP.com – but this solution requires only the latter of these for command-line operations.

image

Next, start up the mini-SFTP server (msftpsrvr.exe). The UI is quite simple and self-explanatory, only requiring a user name, password, port number (which defaults to 22), and the root directory.

image

For testing, add a few files to the root directory specified above. Be sure to also include a few files that will not be downloaded as part of the SFTP operation for more comprehensive testing.

Downloading From SFTP with SSIS

With the testing environment set up, it’s time to build a solution. In a new SSIS project, create a new package (I’ve named mine DownloadSFTP.dtsx). Navigate to the Parameters tab, where we’ll create a handful of runtime values that will make the DownloadSFTP package more reusable.

image

As shown above, all of the values that might change over time – the server name, user ID and password, download directory, etc. – are parameterized. This accomplishes two things. First, when those values do change (and they will), there is a single point of administration rather than a bunch of hard-coded values in the package. Even better, by parameterizing these values, you can build a package that is process-agnostic, and have just one package that can be used by multiple ETL processes.

For those using older versions of SSIS (prior to 2012), you won’t be able to use parameters. However, with a little extra work you can accomplish the same thing I’ve described here by using package configurations.

I’ll call out a couple of these values that may not be obvious:

  • pFilename: This is the file name to download from the server. Note that we can also use wildcards (assuming they are supported by the target server) – in the example above, we’ll be downloading all files ending in “.TXT”.
  • pServerHostKey: This is used to satisfy a security mechanism built into the WinSCP process. By default, WinSCP will prompt the user to verify and add to local cache the host key when connecting to an SFTP server for the first time. Because this will be done in an automated, non-interactive process, getting that prompt would cause an error in our script. To prevent this, the script is built to supply the server host key to avoid the error, and also has the added benefit of ensuring we’re actually connecting to the correct SFTP server. This brief article on the WinSCP documentation site describes how to retrieve the server host key for the target server.
  • pServerUserPassword: This is marked as sensitive to mask the password. As part of the script logic, this password will be decrypted before it is sent to the server.

With the parameters established, we can jump into the code portion of the solution. Create a new script task in the control flow, and add all 7 of the parameters shown above to the list of ReadOnlyVariables.

image

Next, we’ll add the code to set up the Process object that will ultimately trigger the execution of WinSCP.com to perform the download. As shown below, using the Main() function (which is created automatically in a new script task), we’re creating the Process object and configuring a few of the runtime options, including the name of the executable and the download directory.

The next step is to create the input strings that will make the connection and download the file. At the bottom of this snippet, there are 3 variables that will capture output messages, error messages, and the return value, all of which will be used to log runtime information.

With all of the options configured, it’s time to invoke WinSCP.com. The try/catch block below will attempt to connect and download the specified file from the server.

Technically, that’s all we have to do. However, I always like to build in enough information for logging that I know when and why a problem occurred. The final chunk of code below checks for error conditions – either a nonzero return code or a nonblank errStr value – and will log the message and fail the script task (thus failing the package) on error. If there is no error, we’ll simply use the Dts.Events.FireInformation() method to log the output from the execution of WinSCP.com.

That’s it! The package is ready to be executed. Assuming everything is configured properly, running the package on my system should download exactly two text files (remember, we used the wildcard “*.txt” to get all text files). And upon review, I find that’s exactly what happens when the package runs.

image

To briefly test out the error logging facilities when using this package to connect to SFTP with SSIS, I’m going to update the package parameter for User Name to an invalid value. Changing the user name to “userx” results in a package failure, exactly as we’ve designed it. When I review the log information, I can see that the log captures the fact that WinSCP.com failed, which should help determine that it is an authentication-related issue.

image

Usage in the Real World

This pattern takes a little effort to implement. Even if you copy verbatim the code I’ve used in this example, you’ll probably make some modifications, add behavior, etc. That takes time. However, this pattern has a significant advantage in that you could build this one time, and use it for all of your SFTP download operations. It’s built with reusability in mind, and with minor tweaks to suit your environment, you could use this package as a solution to all of your SFTP downloads, deploying it once (or at most, once per SSIS server) and continuing to reuse it.

Conclusion

Although accessing files stored on SFTP with SSIS is challenging, you can roll your own solution by using free tools and a little bit of script. In a future post, I’ll extend this pattern to demonstrate how to go the other direction and write files back to an SFTP server.

You can download the code from this solution here.

About the Author

Tim Mitchell
Tim Mitchell is a business intelligence and SSIS consultant who specializes in getting rid of data pain points. Need help with data warehousing, ETL, reporting, or SSIS training? Contact Tim here: TimMitchell.net/contact

20 Comments on "Accessing SFTP with SSIS"

  1. Looks great and exactly what I need but when I try and run it (after reconfiguring to my settings) a ‘Exception has been thrown by the target of an invocation’ error. Can’t see anything obvious causing it but must be missing something!

    • Phil, that’s one of the unfortunate downsides of using scripting – sometimes capturing the error message takes some extra effort in the script itself. You might need to add some logic to capture the exact error message.

  2. How on earth is it the case that Microsoft hasn’t included SFTP in their SSIS product in the year 2016!??!?! This is absurd…

  3. Vijaya Chappidi | August 26, 2016 at 8:37 am | Reply

    is there any sftp task which supports ssis 2016?

  4. thanks for your reply and this method in your article is awesome. But we just wondering is it possible to get the list of all file names from the sftp server before we download any file there?

  5. Hi Tim,

    What a great article.

    Just a quick highlight for line “winscp.WaitForExit();”, it MUST be after “winscp.StandardOutput.ReadToEnd()” when intended to read from output stream synchronously.

    Otherwise, when transferring a long list of files, it will cause “deadlock” due to Synchronous read operations introduce a dependency between the caller reading from the StandardOutput stream and the child process writing to that stream. In turn, the child process will be dependent on parent when it writes enough data to fill its redirected stream (due to long list of files, my experience).

    Please refer to MSDN article: https://msdn.microsoft.com/en-us/library/system.diagnostics.process.standardoutput(v=vs.110).aspx

    Regards,
    /Tommy

    • Tommy, thanks for the comment. You’re absolutely correct – although the code will work as originally shown, it hangs when trying to capture the outputs when the amount of text is more than just a few characters. I have updated the snippet to put WaitForExit() in the right order. Thanks again!

  6. Where Can i Find the Log information you just mentioned in a post .

    I am not getting a desired output so i wanted to see where my code failed.
    Please suggest

    • Rahul, the output messages would be written to the Progress window when executing from SSDT. When executing the package from the SSIS catalog, it would be written to the catalog logging table in SSISDB (catalog.event_messages, if I remember correctly).

  7. Hi Tim . I really don’t understand what process you create from the begining. the process object , what members or evens have inside?

  8. Hi Pedro, if you are referring to Process object used in the main{} method, it belongs to .net class

    The idea is to instantiate this object to host winscp program (.com) as a process, which then able to send key strokes to emulate user command line interaction.

    Do follow through this article. It’s the best article on SFTP Interface so far in term of simplicity using this method.

    Unless you wanted more control over the Interface, then you could explore winscp API.

    For myself, I had implemented the former method by logging the error or information into a table, in addition to logging them in SSIS log. Very grateful to Tim for sharing this!

    Regards,
    /tommy

  9. Tim, I am using SSIS 2016, after copy the downloaded code from this webpage, the script saying The type or namespace name ‘Process’ could be found (are you missing a using directive or an assembly reference? ) , the message is referring to the line in the script : Process winscp= new Process();

    Thank you!
    Lisa

  10. Excellent article Tim.It was very useful and helped me achieve the SFTP task

  11. Tim, I want to mention my Local destination File Name in the script. Where Can I do that?? Because my remote file has no extension I want to put that on my system adding the specific naming convention wit the file extension

  12. Lisa, try using System.Diagnostics;
    That worked for me

  13. Do i need to Include winscp with Includes , as i am getting invokation error on the first statement. Any help

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: