I am using the RODBC
package to query a text column from a database. The database is built on Microsoft SQL Server 2008 R2. The data type of the column in SQL is nvarchar(max)
.
However, when I run:
# Set up ODBC connection to CCWEB5 production server
# Note: default database is set to "CCSalary"
ccweb5.prod <- odbcConnect("ccweb5")
# Read in some job ad text
job.text <- sqlQuery(ccweb5.prod,"
SELECT TOP 100
ja.JobTitle,
ja.JobText as 'JobText',
LEN(ja.JobText) as 'JobTextLength'
FROM JobStore.dbo.JobAd as ja (NOLOCK)
")
Within SQL, I am expecting (for the top row):
JobTitle JobText JobTextLength
IT Field Service Technician <text goes here...> 2742
However, when I do: nchar(as.character(job.text[1,2]))
It returns: 255.
So my question is, what is causing this truncation and how do I avoid it? Thanks!!
VARCHAR(255) stores 255 characters, which may be more than 255 bytes.
You should only use VARCHAR(MAX) columns when you know some of the data you are going to store exceed the 8000-byte limit of a VARCHAR(N) column, and all of the data is shorter than the 2 GB limit for the VARCHAR(MAX) column.
varchar [ ( n | max ) ] Use n to define the string size in bytes and can be a value from 1 through 8,000, or use max to indicate a column constraint size up to a maximum storage of 2^31-1 bytes (2 GB).
Like any other programming language, SQL also supports a gamut of data types that can hold integer data, date and time data, character data etc. and allows you to define data types of your own as well. SQL varchar is one of the best-known and most-used data types among the lot.
OK, so it seems that I have found a work-around to this. After some more Google'ing, I found that:
One thing to consider with the SQL Native Client ODBC driver is that VARCHAR(MAX) has does not have fixed size and the ODBC driver represents this by returning a max column size of 0. This can confuse your application if it doesn't check for 0 as a special case. See the bottom section of this article: http://msdn.microsoft.com/en-us/library/ms130896.aspx But in general I have not seen this happen with any of my .NET applications as it is handled properly in ADO.NET.
Source: http://bytes.com/topic/sql-server/answers/808461-cannot-read-varchar-max
So, in my case, the following did the trick:
job.text <- sqlQuery(ccweb5.prod,"
SELECT DISTINCT TOP 100
ja.JobTitle,
[JobText] = CAST(ja.JobText AS varchar(8000)), -- note the data-type re-cast
[JobTextLength] = LEN(ja.JobText)
FROM JobStore.dbo.JobAd as ja (NOLOCK)
")
Such that nchar(as.character(job.text[1,2]))
now returns 2742 (as it should).
I didn't see any similar questions on StackOverflow so I'll leave this up. Hope this helps somebody!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With