How to use RODBC to save dataframe to table with primary key generated at database

Tags: sql, sql-server, r, rodbc

I would like to enter a data frame into an existing table in a database using an R script, and I want the table in the database to have a sequential primary key. My problem is that RODBC doesn't seem to allow the primary key constraint.

Here's the SQL for creating the table I want:

CREATE TABLE [dbo].[results] (
    [ID]         INT            IDENTITY (1, 1) NOT NULL,
    [FirstName]  VARCHAR (255) NULL,
    [LastName]   VARCHAR (255) NULL,
    [Birthday]   DATETIME      NULL,
    [CreateDate] DATETIME      NULL,
    CONSTRAINT [PK_dbo.results] PRIMARY KEY CLUSTERED ([ID] ASC)
);

And a test with some R code:

library(RODBC)

# Connection strings: source server (read) and target server (write).
ConnectionString1 <- "Driver=ODBC Driver 11 for SQL Server;Server=myserver; Database=TestDb; trusted_connection=yes"
ConnectionString2 <- "Driver=ODBC Driver 11 for SQL Server;Server=notmyserver; Database=TestDb; trusted_connection=yes"

# Pull some test data into a data frame.
db1 <- odbcDriverConnect(ConnectionString1)
query <- "SELECT a.[firstname] as FirstName
  , a.[lastname] as LastName
  , Cast(a.[dob] as datetime) as Birthday
  , cast(a.createDate as datetime) as CreateDate
FROM [dbo].[People] a"
results <- sqlQuery(db1, query, stringsAsFactors = FALSE)
close(db1)

# Append the data frame to the results table on the target server.
db2 <- odbcDriverConnect(ConnectionString2)
sqlSave(db2,
    results,
    append = TRUE,
    varTypes = c(Birthday = "datetime", CreateDate = "datetime"),
    colnames = FALSE,
    rownames = FALSE, fast = FALSE)
close(db2)

The first part of the R code is just getting some test data into a dataframe--it works fine and it's not part of my question here (I'm just including it here so you can see what format the test data is). When I run the sqlSave function I get an error message:

Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent

However, if I remove the primary key from the database, everything works fine with this table:

CREATE TABLE [dbo].[results] (
    [FirstName]  VARCHAR (255) NULL,
    [LastName]   VARCHAR (255) NULL,
    [Birthday]   DATETIME      NULL,
    [CreateDate] DATETIME      NULL
);

Clearly the primary key is the issue. Normally, with Entity Framework or a similar tool (as I understand it), the primary key is generated by the database when you insert data.

I'd like a way to append data to a table with a primary key using only an R script. Is that possible? There could already be data in the table I'm adding to, so I don't really see a way to create keys in R before trying to append to the table.

asked May 12 '18 15:05

Matthew

1 Answer

The problem is at line 361 of http://github.com/cran/RODBC/blob/master/R/sql.R: the data.frame and the database table must have exactly the same number of columns; otherwise you get this error, with the following stack trace:

Error in dimnames(x) <- dn : 
  length of 'dimnames' [2] not equal to array extent 
3. `colnames<-`(`*tmp*`, value = c("ID", "FirstName", "LastName", 
   "Birthday", "CreateDate")) at sql.R#361
2. sqlwrite(channel, tablename, dat, verbose = verbose, fast = fast, 
   test = test, nastring = nastring) at sql.R#211
1. sqlSave(db2, results, append = TRUE, varTypes = c(Birthday = "datetime", 
    CreateDate = "datetime"), colnames = FALSE, rownames = FALSE, 
    fast = FALSE, verbose = TRUE)
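
The `colnames<-` failure in the trace can be reproduced without RODBC or a database. The following is only a minimal sketch: the matrix `x` is a hypothetical stand-in for the four-column data, and the five names are the table's columns taken from the trace.

```r
# Stand-in for the data: 4 columns (FirstName, LastName, Birthday, CreateDate).
x <- matrix(NA_character_, nrow = 2, ncol = 4)

# The table has 5 columns (ID plus the four above), as listed in the trace.
table_cols <- c("ID", "FirstName", "LastName", "Birthday", "CreateDate")

# Assigning 5 names to 4 columns fails; capture the message instead of stopping.
msg <- tryCatch({
  colnames(x) <- table_cols
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```

The message mentions 'dimnames' [2] because the second dimension (the columns) is the one whose length does not match.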

Adding the ID column to your data.frame is not a solution (or even a workaround), because you would then have to supply the key values yourself instead of letting the auto-increment ID column generate them.

A "simple" workaround for the "same columns" limitation of RODBC::sqlSave is:

1. Use sqlSave to save the new rows into a separate table under another name.
2. Send an insert into ... select from ... statement via RODBC::sqlQuery to append the new rows to your original table, which includes the autoinc ID column.
3. Delete the table holding the new rows again (drop table ...).
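
The three steps above could be sketched roughly as follows. This is an untested outline rather than a definitive implementation: the function names `build_insert_select` and `append_via_staging` and the staging table name `results_staging` are hypothetical, and the check on the return value of sqlQuery assumes it returns -1 on a general error when errors = FALSE.

```r
# Build "INSERT INTO target (cols) SELECT cols FROM staging" (SQL Server quoting).
build_insert_select <- function(target, staging, cols) {
  col_list <- paste0("[", cols, "]", collapse = ", ")
  sprintf("INSERT INTO %s (%s) SELECT %s FROM %s",
          target, col_list, col_list, staging)
}

# Hypothetical helper: append `dat` to `target` through a temporary staging table.
# Extra arguments (e.g. varTypes) are passed on to RODBC::sqlSave.
append_via_staging <- function(channel, dat, target = "results",
                               staging = "results_staging", ...) {
  # Step 1: save the new rows into a staging table without the ID column.
  RODBC::sqlSave(channel, dat, tablename = staging,
                 rownames = FALSE, fast = FALSE, ...)
  # Step 3 (registered early): drop the staging table even if step 2 fails.
  on.exit(RODBC::sqlDrop(channel, staging, errors = FALSE), add = TRUE)
  # Step 2: let the database generate ID while copying the rows over.
  sql <- build_insert_select(target, staging, names(dat))
  res <- RODBC::sqlQuery(channel, sql, errors = FALSE)
  if (identical(res, -1L)) {
    stop(paste(RODBC::odbcGetErrMsg(channel), collapse = "\n"))
  }
  invisible(sql)
}

# Usage (not run: needs a reachable SQL Server):
# db2 <- RODBC::odbcDriverConnect(ConnectionString2)
# append_via_staging(db2, results,
#                    varTypes = c(Birthday = "datetime", CreateDate = "datetime"))
# close(db2)
```

Naming the columns explicitly in the insert statement is what lets the ID column be left out, so the database fills it in.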

A better option would be to use the newer odbc package, which also offers better performance through bulk-like inserts instead of sending single insert statements as RODBC does:

https://github.com/r-dbi/odbc

Look for the function dbWriteTable (which is an implementation of the interface DBI::dbWriteTable).
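
A rough sketch of the odbc route, assuming the DBI and odbc packages are installed and that an append names only the columns present in the data frame, so the database can generate ID. The helper names `without_key` and `append_with_odbc` are hypothetical, and the snippet has not been run against a real server.

```r
# Drop the database-generated key column, if present, so that only the
# columns supplied by R are sent to the database.
without_key <- function(dat, key = "ID") {
  dat[, setdiff(names(dat), key), drop = FALSE]
}

# Hypothetical wrapper around DBI::dbWriteTable for appending rows.
append_with_odbc <- function(con, dat, table = "results", key = "ID") {
  DBI::dbWriteTable(con, table, without_key(dat, key), append = TRUE)
}

# Usage (not run: needs a reachable SQL Server and the odbc package):
# con <- DBI::dbConnect(odbc::odbc(), .connection_string = ConnectionString2)
# append_with_odbc(con, results)
# DBI::dbDisconnect(con)
```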

answered Sep 24 '22 19:09

R Yoda
