Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server on Linux > Bulk Import error

did you ever get this error when trying to bulk insert a .tsv file into sql-server-linux docker image:

Referenced external data source "(null)" not found.

Here's the command, table exists, file exists on server:

BULK INSERT countries FROM '/import/file.tsv'
WITH (
  FIELDTERMINATOR = '\t'
);

Thank you Thomas

like image 257
Tom Zayouna Avatar asked Dec 30 '16 09:12

Tom Zayouna


2 Answers

I had a lot of troubles with BULK INSERT on linux, where a valid path name was rejected:

# sqlcmd -S localhost -d leak -U sa -Q "BULK INSERT leak FROM '/tmp/data.txt'"
Msg 12703, Level 16, State 1, Server mssql, Line 1
Referenced external data source "(null)" not found.

# ll /tmp/data.txt
-rw-r--r-- 1 root root 30M Feb 20 02:40 /tmp/data.txt

And the suggested workaround to replace slashes with backslashes and prepending C:\ didn't work:

# sqlcmd -S localhost -d leak -U sa -Q "BULK INSERT leak FROM 'C:\\tmp\\data.txt'"
Msg 12703, Level 16, State 1, Server mssql, Line 1
Referenced external data source "(null)" not found.

So, I investigated a bit, and by running strace on the daemon I found this:

# strace -fp 3198 -e open
strace: Process 3198 attached with 175 threads
[pid  3202] open("/proc/self/status", O_RDONLY) = 170
[pid  3202] open("/proc/meminfo", O_RDONLY) = 170
[pid  3321] open("/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 175
[pid  3321] open("/tmp/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 175
[pid  3321] open("/tmp/data.txt", O_RDONLY) = 175
[pid  3321] open("/tmp/data.txt", O_RDONLY|O_DIRECT) = -1 EINVAL (Invalid argument)

The open() fails because O_DIRECT is not supported by the tmpfs filesystem. So I moved the file in the root and I made it world accessible:

# ll /data.txt
-rw-rw-rw- 1 root root 30M Feb 20 02:28 /data.txt

# ll /data.txt
-rw-rw-rw- 1 root root 30M Feb 20 02:28 /data.txt
# sqlcmd -S localhost -d leak -U sa -Q "BULK INSERT leak FROM '/data.txt'"
Msg 4860, Level 16, State 1, Server mssql, Line 1
Cannot bulk load. The file "/data.txt" does not exist or you don't have file access rights.

But this time the server doesn't even tries to access the file. Moving the file in a world accessible directory other than the root, fixed it:

# sqlcmd -S localhost -d leak -U sa -Q "BULK INSERT leak FROM '/media/data.txt'"

(1000000 rows affected)
like image 64
teknoraver Avatar answered Oct 21 '22 04:10

teknoraver


This is a bug in SQL Server on Linux with how we handle paths. We are still working through getting everything in SQL Server to handle Linux path syntax. I've filed a bug on this and we'll get it fixed. In the meantime, you can specify the path as C:\import\file.tsv where C:\ is a placeholder for the root of the Linux file system (i.e. '/') and the slashes are just reversed. I tested this exact scenario in a RHEL VM and it works fine. Pro tip: This C:\ trick will work anywhere that paths are passed in T-SQL so if you run into this kind of an issue for something else requiring paths, give it a try.

Bug #9380471 for Microsoft-internal reference.

like image 3
Travis Wright Avatar answered Oct 21 '22 02:10

Travis Wright