Nearly everyone uses the flat file source adapter, and nearly everyone that uses it runs into a common set of challenges. In Denali, we?ve made a couple of key enhancements to the Flat File Source to support the most commonly asked-for options:
- Delimited files with varying numbers of columns per row (aka ?Ragged-Right? delimited files)
- Delimited files with embedded qualifiers
Ragged-Right Delimited Files
A common way NULL data is represented at the end of rows in delimited files is without the column delimiter. Here?s an example:
OrderID | CustomerID | EmployeeID | ShipVia |
10248 | VINET | 5 | 3 |
10249 | TOMSP | 6 | |
10250 | HANAR | 4 | 2 |
When represented as a delimited file, it?s fairly common to see this kind of representation
OrderID,CustomerID,EmployeeID,ShipVia 10248,VINET,5,3 10249,TOMSP,6 10250,HANAR,4,2
Note the lack of a trailing comma after the ?6? in the second row. Customarily, most flat file parsers treat the lack of data on this row to mean there are NULLs in the remaining columns on the row.
Prior to Denali, SSIS took a different approach, where it would ignore any row delimiter until it believed it was parsing the last column of the row. So for example the flat file above might be parsed as:
OrderID | CustomerID | EmployeeID | ShipVia |
10248 | VINET | 5 | 3 |
10249 | TOMSP | 610250 | HANAR,4,2 |
Of course, since ?HANAR,4,2? isn?t a valid number, if ShipVia was an integer column, the parse would fail. And worse yet, there was little you could do to work around the limitation; your choices were to write your own parser, convert the data before parsing it, use the Flat File Source just to parse rows, or some such.
In Denali, this kind of file is parsed differently ? by default, we?ll always look for a row delimiter in unquoted data, and start a new row if it?s seen. So the table above is parsed as one would expect:

This new behavior is on by default, but can be disabled at any time using the AlwaysCheckForRowDelimiters property on the connection manager.
Embedded Qualifiers
Another custom in delimited files is the use of a qualifier character to ?escape? or embed a qualifier character into a qualified string, for example:
ID | Title |
1148 | Can?t Buy a Thrill |
1149 | Echoes, Silence, Patience & Grace |
To allow the literal commas to be part of the string, the use of qualifiers is typical in a flat file representing this data. However if an apostrophe(') is used for qualifying, the apostrophe in the first row must be escaped, and a typical way to accomplish this is by doubling the qualifier character:
ID,Title 1148,'Can''t Buy a Thrill' 1149,'Echoes, Silence, Patience & Grace'
Prior to Denali, such files would always fail to parse: though SSIS supported qualifiers, SSIS had no support for embedded qualifiers, and would treat the first qualifier character (in this case the first apostrophe after the ?n? in ?Can?t Buy a Thrill?) as the end of that field, and throw an error due to the lack of a delimiter immediately after the apostrophe.
In Denali, this is fixed, and the data above is properly parsed as:

SONUS NETWORKS SONIC AUTOMOTIVE SKYWORKS SOLUTIONS SILICON LABORATORIES SI INTERNATIONAL