I would like to enter a data frame into an existing table in a database using an R script, and I want the table in the database to have a sequential primary key. My problem is that RODBC doesn't seem to allow the primary key constraint.
Here's the SQL for creating the table I want:
CREATE TABLE [dbo].[results] (
[ID] INT IDENTITY (1, 1) NOT NULL,
[FirstName] VARCHAR (255) NULL,
[LastName] VARCHAR (255) NULL,
[Birthday] DATETIME NULL,
[CreateDate] DATETIME NULL,
CONSTRAINT [PK_dbo.results] PRIMARY KEY CLUSTERED ([ID] ASC)
);
And a test with some R code:
ConnectionString1="Driver=ODBC Driver 11 for SQL Server;Server=myserver; Database=TestDb; trusted_connection=yes"
ConnectionString2="Driver=ODBC Driver 11 for SQL Server;Server=notmyserver; Database=TestDb; trusted_connection=yes"
db1=odbcDriverConnect(ConnectionString1)
query="SELECT a.[firstname] as FirstName
, a.[lastname] as LastName
, Cast(a.[dob] as datetime) as Birthday
, cast(a.createDate as datetime) as CreateDate
FROM [dbo].[People] a"
results=NULL
results=sqlQuery(db1,query,stringsAsFactors=FALSE)
close(db1)
db2=odbcDriverConnect(ConnectionString2)
sqlSave(db2,
results,
append = TRUE,
varTypes=c(Birthday="datetime", CreateDate="datetime"),
colnames = FALSE,
rownames = FALSE,fast=FALSE)
close(db2)
The first part of the R code is just getting some test data into a dataframe--it works fine and it's not part of my question here (I'm just including it here so you can see what format the test data is). When I run the sqlSave function I get an error message:
Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent
However, if I remove the primary key from the database, everything works fine with this table:
CREATE TABLE [dbo].[results] (
[FirstName] VARCHAR (255) NULL,
[LastName] VARCHAR (255) NULL,
[Birthday] DATETIME NULL,
[CreateDate] DATETIME NULL
);
Clearly the primary key is the issue. Normally with entity framework or whatever (as I understand it), the primary key is created at the database when you enter data.
I'd like a way to append data to a table with a primary key using only an R script. Is that possible? There could already be data in the table I'm adding to, so I don't really see a way to create keys in R before trying to append to the table.
The problem is line 361 in http://github.com/cran/RODBC/blob/master/R/sql.R: the data.frame and the DB table must have exactly the same number of columns, otherwise you get this error with this stack trace:
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent
3. `colnames<-`(`*tmp*`, value = c("ID", "FirstName", "LastName",
"Birthday", "CreateDate")) at sql.R#361
2. sqlwrite(channel, tablename, dat, verbose = verbose, fast = fast,
test = test, nastring = nastring) at sql.R#211
1. sqlSave(db2, results, append = TRUE, varTypes = c(Birthday = "datetime",
CreateDate = "datetime"), colnames = FALSE, rownames = FALSE,
fast = FALSE, verbose = TRUE)
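The error itself can be reproduced without a database. This is only a minimal sketch: it assumes the data has been converted to a matrix by the time the column names are assigned (which is what the `dimnames` message suggests), and the matrix values are made up.

```r
# A 2 x 4 matrix standing in for the four data columns (values are made up).
m <- matrix(seq_len(8), ncol = 4)

# The table has five columns because of the IDENTITY column.
table_cols <- c("ID", "FirstName", "LastName", "Birthday", "CreateDate")

# Assigning five names to four columns fails; capture the message instead of stopping.
msg <- tryCatch({
  colnames(m) <- table_cols
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```

The printed message should resemble the one in the stack trace above.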
If you add the ID column to your data.frame, you can no longer use the autoinc ID column, so this is no solution (or workaround).
A "simple" workaround to the "same columns" limitation of RODBC::sqlSave is:
1. Use sqlSave to save the new rows into another table (under a different table name).
2. Use insert into ... select from ... via RODBC::sqlQuery to append the new rows to your original table that includes the autoinc ID column.
3. Drop the other table again (drop table...).
A better option would be to use the new odbc package, which also offers better performance through bulk-like inserts instead of sending single insert statements like RODBC does:
https://github.com/r-dbi/odbc
Look for the function dbWriteTable (which is an implementation of the interface DBI::dbWriteTable).
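Both options could look roughly like the sketch below. The helper names (`build_insert_select`, `append_via_staging`, `append_via_odbc`) and the staging table name `results_staging` are hypothetical, and the table names are assumed to resolve in your default schema. For the odbc route I would expect the insert to name only the data frame's columns, so that SQL Server fills in ID, but please verify that against your own setup.

```r
# Sketch only: helper names and the staging table name are hypothetical.
# Package functions are called with pkg:: so nothing needs to be attached.

# Build "INSERT INTO target (cols) SELECT cols FROM staging". The IDENTITY
# column is left out of the column list so that SQL Server assigns it.
build_insert_select <- function(target, staging, cols) {
  col_list <- paste0("[", cols, "]", collapse = ", ")
  sprintf("INSERT INTO %s (%s) SELECT %s FROM %s",
          target, col_list, col_list, staging)
}

# Option 1: RODBC with a staging table.
# Extra arguments (e.g. varTypes) are passed through to sqlSave.
append_via_staging <- function(channel, dat, target = "results",
                               staging = "results_staging", ...) {
  # Save the rows into a new table whose columns match the data frame.
  RODBC::sqlSave(channel, dat, tablename = staging, append = FALSE,
                 rownames = FALSE, fast = FALSE, ...)
  # Drop the staging table on exit, even if the insert below fails.
  on.exit(RODBC::sqlDrop(channel, staging, errors = FALSE), add = TRUE)
  # Copy the rows into the real table.
  RODBC::sqlQuery(channel, build_insert_select(target, staging, names(dat)))
}

# Option 2: odbc + DBI, appending directly to the existing table.
append_via_odbc <- function(con, dat, target = "results") {
  DBI::dbWriteTable(con, target, dat, append = TRUE)
}

# Usage with the objects from the question (not run; needs a live connection):
# db2 <- RODBC::odbcDriverConnect(ConnectionString2)
# append_via_staging(db2, results,
#                    varTypes = c(Birthday = "datetime", CreateDate = "datetime"))
# close(db2)
#
# con <- DBI::dbConnect(odbc::odbc(), .connection_string = ConnectionString2)
# append_via_odbc(con, results)
# DBI::dbDisconnect(con)
```

The staging variant assumes `results_staging` does not exist yet; with sqlSave's default `safer = TRUE` it should refuse to overwrite an existing table rather than replace it.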