
SSIS Flat file could not deal with NUL (\x00) value?

I am trying to load data from text files into a database. For some reason, my source files contain the null character NUL (Picture1).

Picture1

I defined all the fields as a single column (with {CR}{LF} as the row delimiter). Then I previewed the data.

Picture2

The preview shows exactly the data we need. But when I run the package, the data is different from what I see in the preview. I added a data viewer to inspect it.

Picture3

Picture4

The number 1 disappears from the first row (marked in red). It seems that the flat file source stops reading at the NUL character. But my row delimiter is {CR}{LF}, so it doesn't make sense that the trailing 1 is dropped. Can anyone tell me why this happens?

asked Jun 27 '13 at 07:06 by morgan117



1 Answer

Reproducing the error

First of all, I would like to show the steps to reproduce this error using the Notepad++ editor.

I created a text file called TestNUL that contains data similar to the screenshot posted in the question (commas are placed where the NUL characters should go):

enter image description here

Now go to the Edit menu >> Character Panel

enter image description here

The ASCII character panel is now shown. Double-click the NUL entry to insert it into the text:

enter image description here

The text file will now look like this:

enter image description here

You can use the following link to download the file:

  • TestNUL.txt

Removing NUL character using Notepad++

To remove this character, open the file in Notepad++ and press Ctrl + H to open the Find and Replace dialog. Then select the Regular expression search mode and replace \x00 with an empty string:

enter image description here

All NUL characters are removed:

enter image description here

Find and replace in multiple files

If you need to find and replace this character in multiple files, you can do so in Notepad++ with the Find in Files feature:

  • How to find and replace line(s) in multiple files using Notepad++?
  • How to Find and Replace Words in Multiple Files
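If you would rather script the multi-file cleanup than click through Notepad++, something along these lines might work. This is only a sketch: the class name NulBatchCleaner, the method CleanFolder and its parameters are made up for illustration, and it assumes the files are plain ASCII or UTF-8 and small enough to load into memory.

```csharp
using System.IO;

public static class NulBatchCleaner
{
    // Hypothetical helper: removes NUL characters from every file in
    // `folder` whose name matches `pattern` (for example "*.txt").
    // Returns the number of files that were actually rewritten.
    public static int CleanFolder(string folder, string pattern)
    {
        int changed = 0;

        // GetFiles returns a snapshot, so rewriting files inside the loop is safe.
        foreach (string path in Directory.GetFiles(folder, pattern))
        {
            string text = File.ReadAllText(path);

            // Leave files without NUL characters untouched.
            if (text.IndexOf('\0') < 0)
            {
                continue;
            }

            File.WriteAllText(path, text.Replace("\0", ""));
            changed++;
        }

        return changed;
    }
}
```

For example, NulBatchCleaner.CleanFolder(@"C:\Data\Incoming", "*.txt") would clean every .txt file in that (hypothetical) folder.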

Automating the process within SSIS

Since the issue occurs at run time and not while previewing the data, you can add a Script Task before the Data Flow Task to replace all \x00 values with an empty string. You can read the text file path from the flat file connection manager, or you can store it in a variable. You can use C# code similar to the following:


public void Main()
{
    // For a flat file connection manager, ConnectionString holds the file path.
    string filePath = Dts.Connections["SourceConnection"].ConnectionString;

    // Read the whole file, strip every NUL (\x00) character, and write it back in place.
    string text = System.IO.File.ReadAllText(filePath);
    text = text.Replace("\0", "");
    System.IO.File.WriteAllText(filePath, text);

    Dts.TaskResult = (int)ScriptResults.Success;
}

If you are working with large text files, you can use the System.IO.StreamReader and System.IO.StreamWriter classes to process the file line by line with ReadLine() instead of loading it all into memory.

  • How to read a large (1 GB) txt file in .NET?
  • How can I read, replace and write very large files?
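A streaming version could look roughly like the sketch below. Note that it reads fixed-size character chunks with Read() rather than ReadLine(), so that the original line breaks are copied through unchanged. The names NulStripper and StripNul are made up for illustration, and the sketch assumes you know the file's encoding and can write the cleaned output to a second file.

```csharp
using System.IO;
using System.Text;

public static class NulStripper
{
    // Hypothetical helper: copies sourcePath to targetPath and drops every
    // NUL (U+0000) character. Only one buffer is held in memory at a time,
    // so memory use does not grow with the file size.
    public static void StripNul(string sourcePath, string targetPath, Encoding encoding)
    {
        char[] buffer = new char[64 * 1024];

        using (var reader = new StreamReader(sourcePath, encoding))
        using (var writer = new StreamWriter(targetPath, false, encoding))
        {
            int read;
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Compact the chunk in place, keeping only non-NUL characters.
                int kept = 0;
                for (int i = 0; i < read; i++)
                {
                    if (buffer[i] != '\0')
                    {
                        buffer[kept++] = buffer[i];
                    }
                }

                writer.Write(buffer, 0, kept);
            }
        }
    }
}
```

In a Script Task you might call it as NulStripper.StripNul(sourcePath, cleanedPath, new UTF8Encoding(false)) and point the flat file connection manager at the cleaned file; both path variables are placeholders here.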

Experiments

I created a package and added two flat file connection managers: the source reads from the TestNUL.txt file, and the destination creates a new TestNUL_edited.txt file with the same structure. I added a Script Task with the code above and a data viewer in the Data Flow Task. The following screenshots show that the rows are no longer corrupted:

enter image description here

enter image description here

The following screenshot also shows that the NUL values have been removed from the source file after running the Script Task:

enter image description here
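If you want to confirm the result without opening the file in an editor, a small check like this one may help. It is a sketch only: NulCheck and CountNulBytes are made-up names, and counting raw 0x00 bytes is only meaningful for single-byte or UTF-8 encoded files (in a UTF-16 file, zero bytes are a normal part of the encoding).

```csharp
using System.IO;

public static class NulCheck
{
    // Hypothetical helper: returns how many 0x00 bytes the file contains.
    // The file is read in chunks, so this also works for large files.
    public static long CountNulBytes(string path)
    {
        byte[] buffer = new byte[64 * 1024];
        long count = 0;

        using (FileStream stream = File.OpenRead(path))
        {
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                {
                    if (buffer[i] == 0)
                    {
                        count++;
                    }
                }
            }
        }

        return count;
    }
}
```

A result of 0 after the Script Task has run suggests the cleanup worked; a non-zero count before it runs confirms that the source file really contains NUL bytes.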

answered Oct 18 '22 at 04:10 by Hadi