File-based ETL is usually dull. Most systems generate (or expect to consume) files that are delimited, with a common field separator such as comma, tab, or pipe. However, occasionally you’ll get an oddly formatted file with an unusual delimiter. Although it’s not obvious in the Visual Studio designer, SSIS is capable of consuming and generating files with custom delimiters.
In this short post, I’ll demonstrate how to use custom file delimiters in SSIS flat file connections, which can be used to read or write files with a nonstandard field delimiter.
Custom File Delimiters in SSIS
I’ll set up a business case: I have some data in a table that needs to be exported to a flat file. The data contains literal commas, tabs, pipe symbols, and quote characters, making it difficult to use a standard delimiter. I need to find a really unusual character which should never appear in the actual data, to avoid consistency issues. I choose the negation character (¬), which is rarely found in literal data. When I create the SSIS data flow to generate this file, I can easily walk through creating the flat file connection in the UI, but the list of available delimiters is very short – and the negation character isn’t on it.
Only a few options are shown, and they are all vanilla – and are of no help to me since my data contains these literal characters.
Fortunately, this is an easy fix. This isn’t a constrained drop-down list! You can type or paste in any value you want. In this example, I’ll just paste in the negation character as the delimiter.
The resulting output uses the custom delimiter in the output file.
Because the delimiter field is a free-text field (even though it appears to be a constrained drop-down list), you can use any character you want. In fact, you can use multiple characters, if you need that level of protection against accidentally breaking on a character that is intended as literal data. I’ve done this a few times in the past, using a combination two or three rarely-used characters as a delimiter in cases where the data is so disparate that special characters sometimes appear as literals.
This works well for both import and export operations. Because the delimiter is defined on the flat file connection, the same delimiter will be used for any file read from or written to by that connection.
The SSIS UI is very powerful but occasionally quirky. In this case, it seems to hide the fact that you can use any custom delimiter you wish for flat file imports or exports. Simply typing or pasting in the desired character(s) makes using custom file delimiters in SSIS a very easy process.
Never knew this. Good article.
Thanks, Thomas! Glad it helped.
Thanks for pointing another special character to use 🙂
Question about importing such files (please redirect me if it has been answered before). In a file I have to import, all the delimiters are there meaning not a ragged type of delimited file but the last column is sometimes BLANK, I’ve run into the problem that SSIS interpret that last field as having [CR][LF] and error out the row as invalid character for the destination field. How can I handle such case?
Currently I do a Redirect Row into an error.txt file…
Tien, I’ve not found that to be an issue in SSIS. As long as the line ending character (either CR, CRLF, or LF) is consistent throughout the file, it should work well even if the last field is blank.
I have seen issues with older versions of SSIS where a missing field at the end (not just a blank one, but one or more missing column delimiters) will cause issues. You can get around that by using the method I describe in this post: https://www.timmitchell.net/post/2013/01/14/ragged-flat-file-processing-in-ssis/
Have you ever tried to use a double-byte character delimiter? We tried to use a cedilla (Ç) with UTF8 and the header appears correct but the delimiters in the data do not use the double-byte char like the header does.