I am scraping web data and some results are more than 8K of text. I have a field in my MS SQL Server 2008 R2 that is varchar(max) but RODBC only saves 8K and truncates the rest. I have DSN connections set up using both SQL Native Client 10 from SS R2 and ODBC 11.
sqlType returns -1 for the field title, indicating varchar(max)/text.
Both connections truncate the text at 8K. Is there a fix/workaround for this or am I limited to 8K in SQL Server? I can switch to MySQL if the R mySQL driver will pass the larger amounts of text through. I'm using R 3.0.1 and RODBC 1.3-6 on Win 7 x64.
Here is some sample code to replicate the problem.
CREATE TABLE
[test](
[title] [varchar](max) NULL
) ON [PRIMARY]
library(RODBC)
ch <- odbcConnect("text", uid = "X", pwd = "X")
sqlTypeInfo(ch,"test")
testtext = data.frame("salasjflsjsf")
colnames(testtext) = "title"
## This works
sqlSave(ch,testtext,"test",append=TRUE, rownames=FALSE, colnames = FALSE, safer=FALSE, verbose = TRUE)
## text > 8K
testtext = data.frame(paste(rep("salasjflsjsf",5000),collapse=""))
colnames(testtext) = "title"
## use ODBC default type for title
sqlSave(ch,testtext,"test",append=TRUE, rownames=FALSE, colnames = FALSE, safer=FALSE, verbose = TRUE)
## this message from ODBC
## Query: INSERT INTO "test" ( "title" ) VALUES ( ? )
## Binding: 'title' DataType 12, ColSize 8000
## Parameters:
## no: 1: title salasjfl
## [*text removed*]
## lsjsfsalasjflsjsfsalasjflsjsfsalasjflsjsfsalasjflsjsfsalasjflsjsfsalasjflsjsf/***/
## Warning message:
## In odbcUpdate(channel, query, mydata, coldata[m, ], test = test, :
## character data 'salasjflsjsfsalasjflsjsfsalasjflsjsfsalasjfl
## try with varType set to text
sqlSave(ch,testtext,"test",append=TRUE, varType = "text",rownames=FALSE, colnames = FALSE, safer=FALSE, verbose = TRUE)
## try with varType set to varchar(MAX)
sqlSave(ch,testtext,"test",append=TRUE, varType = "varchar(max)",rownames=FALSE, colnames = FALSE, safer=FALSE, verbose = TRUE)
close(ch)
I changed the column type to text and refreshed the DSN connection. Text type is being picked up but now there is an error message:
chtest <-odbcConnect("test2", uid = "X", pwd = "X")
> sqlColumns(chtest,"test")
TABLE_CAT TABLE_SCHEM TABLE_NAME COLUMN_NAME DATA_TYPE TYPE_NAME COLUMN_SIZE BUFFER_LENGTH DECIMAL_DIGITS NUM_PREC_RADIX NULLABLE REMARKS
1 RSS dbo test title -1 text 2147483647 2147483647 NA NA 1 <NA>
COLUMN_DEF SQL_DATA_TYPE SQL_DATETIME_SUB CHAR_OCTET_LENGTH ORDINAL_POSITION IS_NULLABLE SS_IS_SPARSE SS_IS_COLUMN_SET SS_IS_COMPUTED
1 <NA> -1 NA 2147483647 1 YES 0 0 0
SS_IS_IDENTITY SS_UDT_CATALOG_NAME SS_UDT_SCHEMA_NAME SS_UDT_ASSEMBLY_TYPE_NAME SS_XML_SCHEMACOLLECTION_CATALOG_NAME
1 0 <NA> <NA> <NA> <NA>
SS_XML_SCHEMACOLLECTION_SCHEMA_NAME SS_XML_SCHEMACOLLECTION_NAME SS_DATA_TYPE
1 <NA> <NA> 35
> sqlSave(chtest,testtext,"test",append=TRUE, varType = "text",rownames=FALSE, colnames = FALSE, safer=FALSE, verbose = TRUE)
Query: INSERT INTO "test" ( "title" ) VALUES ( ? )
Binding: 'title' DataType -1, ColSize 2147483647
Error in odbcUpdate(channel, query, mydata, coldata[m, ], test = test, :
'Calloc' could not allocate memory (18446744071562067968 of 1 bytes)
In addition: Warning messages:
1: In odbcUpdate(channel, query, mydata, coldata[m, ], test = test, :
Reached total allocation of 6013Mb: see help(memory.size)
2: In odbcUpdate(channel, query, mydata, coldata[m, ], test = test, :
Reached total allocation of 6013Mb: see help(memory.size)
You defined the text column to be of type varchar(max)
, so it will be stored that way, regardless of specifying varType
when saving the data.
Try creating a table with a TEXT
column instead, then using sqlSave
. This way the data will be stored out of row context and you'll shouldn't have the truncation issue.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With