I am trying to load data from text files into a database. My source files somehow contain the null character NUL (Picture1).
I set up all the fields as a single column (delimited with {CR}{LF}). Then I previewed the data.
The preview shows exactly the data we need. But when I run the package, the data changes and no longer matches what I see in the preview. I added a data viewer to inspect the data.
The number 1 disappears from the first row (see the red). It seems that the flat file reader stops at the NUL character. But my row delimiter is {CR}{LF}, so it doesn't make sense that the number 1 at the end disappears. Can anyone tell me why this happens?
\x00 is a specific byte value (hex 0) that a text reader might interpret in a special way. ASCII files can legitimately contain NUL characters: ASCII covers the code points 0x00 through 0x7F (0 to 127), and NUL is code point 0. It is C-style strings, not ASCII itself, that use NUL as a terminator.
First, here are the steps to reproduce this error using the Notepad++ editor.
I created a text file called TestNUL containing data similar to the screenshot posted in the question (commas mark the places where the NUL characters should go):
Next, go to Edit >> Character Panel.
The ASCII character panel is now shown. Double-click the NULL value to insert it into the text:
The text file will now look like this:
You can use the following link to download the file:
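If you would rather not build the file by hand, a small C# sketch like the one below can write a comparable test file. The row values are made up for illustration, since the original data is only visible in the screenshot, and the class and method names are hypothetical.

```csharp
using System.IO;
using System.Text;

public static class TestFileBuilder
{
    // Writes a small ASCII file whose rows contain NUL characters
    // and end with {CR}{LF}, similar in shape to the TestNUL file.
    public static void WriteTestFile(string path)
    {
        // Hypothetical rows: each "\0" is one NUL character.
        string content =
            "A\0B\0\0" + "1\r\n" +
            "C\0D\0\0" + "2\r\n";
        File.WriteAllText(path, content, Encoding.ASCII);
    }
}
```

Calling TestFileBuilder.WriteTestFile(@"C:\Temp\TestNUL.txt") (the path is only an example) produces a two-row file with three NUL characters per row.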
To remove this character, open the file in Notepad++ and press Ctrl + H to open the Find and Replace dialog. Then select Regular expression as the search mode and replace \x00 with an empty string:
All NUL characters are removed:
If you need to find and replace this character in multiple files, you can use the Find in Files feature in Notepad++:
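The same multi-file cleanup can also be scripted. The following is a minimal sketch, assuming the files are small enough to load into memory and sit in a single folder; the class name, method name, and parameters are hypothetical.

```csharp
using System.IO;

public static class NulBatchCleaner
{
    // Removes NUL characters from every file in 'folder' that matches
    // 'searchPattern' (for example "*.txt"). Returns the number of
    // files that were actually rewritten.
    public static int RemoveNulFromFiles(string folder, string searchPattern)
    {
        int changed = 0;
        foreach (string path in Directory.GetFiles(folder, searchPattern))
        {
            string text = File.ReadAllText(path);
            if (text.IndexOf('\0') < 0)
            {
                continue; // nothing to clean in this file
            }
            File.WriteAllText(path, text.Replace("\0", ""));
            changed++;
        }
        return changed;
    }
}
```

Files without any NUL character are left untouched, so the returned count tells you how many files were affected.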
Since the issue occurs at run time rather than while previewing data, you can add a Script Task before the Data Flow Task to replace all \x00 values with an empty string. You can read the text file path from the flat file connection manager, or store it in a variable. You can use C# code similar to this:
public void Main()
{
    // Read the file path from the flat file connection manager
    string filePath = Dts.Connections["SourceConnection"].ConnectionString;
    // Load the whole file, strip every NUL character (\x00), and write it back
    string text = System.IO.File.ReadAllText(filePath);
    text = text.Replace("\0", "");
    System.IO.File.WriteAllText(filePath, text);
    Dts.TaskResult = (int)ScriptResults.Success;
}
If you are working with large text files, you can use the System.IO.StreamReader and System.IO.StreamWriter classes to read the file line by line with the ReadLine() method.
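A streaming version might look like the sketch below. It reads fixed-size character blocks with Read() rather than calling ReadLine(), a deliberate swap so that the original {CR}{LF} line endings are copied through unchanged. The class and method names are hypothetical, and the encoding is passed in as an assumption about the source file (note that Encoding.UTF8 makes StreamWriter emit a byte order mark).

```csharp
using System.IO;
using System.Text;

public static class NulStripper
{
    // Copies sourcePath to targetPath, dropping every NUL character,
    // without loading the whole file into memory.
    public static void StripNulCharacters(string sourcePath, string targetPath, Encoding encoding)
    {
        char[] buffer = new char[64 * 1024];
        using (var reader = new StreamReader(sourcePath, encoding))
        using (var writer = new StreamWriter(targetPath, false, encoding))
        {
            int read;
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Compact the block in place, keeping only non-NUL characters.
                int kept = 0;
                for (int i = 0; i < read; i++)
                {
                    if (buffer[i] != '\0')
                    {
                        buffer[kept++] = buffer[i];
                    }
                }
                writer.Write(buffer, 0, kept);
            }
        }
    }
}
```

Because the source file is open for reading while the output is written, targetPath should be a different file, for example a temporary file that replaces the original once the copy has finished.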
I created a package and added two flat file connection managers: the source reads from the TestNUL.txt file, and the destination creates a new TestNUL_edited.txt file with the same structure. I added a Script Task with the code above and a data viewer in the Data Flow Task. The following screenshot shows that the rows are not corrupted:
The following screenshot also shows that the NUL values are removed from the source file after running the Script Task: