Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Flat file connection manager carriage return/line feed discrepancy

I'm upgrading from SQL Server 2008R2 to 2017, and making the same jump with SSIS. There are a number of flat file imports that are picking up files that have carriage return/line feeds ({CR}{LF}) embedded within a column in the row.

The 2008R2 flat file connection manager ignores the embedded {CR}{LF}s that are within a row, but the flat file connection manager in 2017 is treating each {CR}{LF} as a new line. It's the same in an upgraded connection manager or a brand new one that I make from scratch.

In both versions, the connection managers have the same specs:

General Tab

  Locale: English (United States)
  Unicode: No
  Code page: 1252 (ANSI-Latin I)
  Format: Delimited
  Text qualifier: <none>
  Header row delimiter: {CR}{LF}
  Header rows to skip: 0
  Column names in the first data row: Check

Columns Tab

  Row delimiter: {CR}{LF}
  Column delimiter: Vertical Bar{|}

The not particularly complicated text file I'm testing with:

row_id|row_data|empty_column|created_by|one_more_field{CR}{LF}
1|random test data||ebrandt|{CR}{LF}
2|Data field with a carriage return{CR}{LF}
and a line feed embedded in it.||ebrandt|

I pasted on the line terminators, just to show that they're there.

On the Columns tab, the Preview window in BIDS 2008R2 shows two rows:

2008R2_connectionmgr

But in 2017, exactly the same file gets broken into three rows:

2017_connectionmgr

There isn't bandwidth in this project to rework all the file imports.

Is there a setting that got changed between versions that I can change back? Or is there another trick to this that I'm missing?

Edit: In response to a comment that's been deleted, I would specify a text qualifier if I could, but the files I'm getting don't have any.

like image 637
Eric Brandt Avatar asked Oct 17 '22 08:10

Eric Brandt


1 Answers

In the Flat File Connection Manager you need to set the AlwaysCheckForRowDelimiters property to False.

Your file will then be parsed as before.

This was a change made in 2012 to change behaviour to the following.

By default, the Flat File connection manager always checks for a row delimiter in unquoted data, and starts a new row when a row delimiter is found. This enables the connection manager to correctly parse files with rows that are missing column fields.

See this link for more about it.

like image 65
Martin Smith Avatar answered Oct 21 '22 06:10

Martin Smith