I have the below-mentioned data frame in R, which I fetch from a database using RMySQL at an interval of 3 hours.
Query:
SELECT Form_No, Date, Name, Age FROM Test WHERE DATE(Date) >= '2020-12-01';
DF:
Form_No Date                Name Age
POS-1   2020-12-01 10:16:12 Alex  29
POS-2   2020-12-01 10:18:34 Mike  31
POS-3   2020-12-01 14:16:22 Paul  21
POS-4   2020-12-01 18:33:14 John  27
POS-5   2020-12-01 20:13:34 Cohe  45
Obviously, every 3 hours when I run the script, some additional entries may have been created in the data frame, and when I use the below-mentioned code to upload the data to a Google Sheet, it overwrites the previous information with the new data.
The problem is that reviewers of the information may have recorded observations against some of the existing rows, so overwriting them loses that work.
The code I am using is:
library(dplyr)
library(stringr)
library(RMySQL)
library(googlesheets)

connection <- dbConnect(MySQL(), user = 'User', password = 'Password',
                        dbname = 'Db name',
                        host = 'host info')
Query <- paste0("SELECT Form_No, Date, Name, Age FROM Test WHERE DATE(Date) >= '2020-12-01';")
DF <- dbGetQuery(connection, Query)

setwd("/home/path")
write.csv(x = DF, file = "cust_data.csv", row.names = FALSE)

as <- gs_title("TargetSheet")
gs_upload("cust_data.csv", sheet_title = "TargetSheet", overwrite = TRUE)
I want to keep the sequence of Form_No and every other detail the same while uploading each new batch of information.
For example, if I ran the query at 06:00 AM, there were five entries as shown in DF, and all of them were uploaded to the Google Sheet; when I run the script again at 09:00 AM, the query may well return some more entries along with those five.
I now need to append only those extra entries to the Google Sheet, skipping the five entries that were already uploaded last time.
Suppose you've saved your query results into a Google Sheet named "DatabaseOutput". You can read the data with googlesheets4::read_sheet. To ensure you get the correct column types every time, I recommend supplying the col_types argument; it is very important that read_sheet returns the same column types as your SQL query does. You can find more information about the col_types specification on the readr documentation page.
library(googledrive)
library(googlesheets4)

# col_types "cTci" = character, datetime, character, integer
OldSheet <- read_sheet(drive_get("DatabaseOutput"),
                       col_types = "cTci")
OldSheet
## A tibble: 3 x 4
#  Form_No Date                Name    Age
#  <chr>   <dttm>              <chr> <int>
#1 POS-1   2020-12-01 10:16:12 Alex     29
#2 POS-2   2020-12-01 10:18:34 Mike     31
#3 POS-3   2020-12-01 14:16:22 Paul     21
Your SQL query and the read from Google Sheets may disagree on exactly what time it is (for example, at sub-second precision), so I recommend flooring the Date column to whole seconds with lubridate:
library(dplyr)
library(lubridate)

# Truncate both Date columns to whole seconds so they compare exactly
DF <- as_tibble(DF) %>%
  mutate(Date = floor_date(as_datetime(Date), unit = "second"))
OldSheet <- OldSheet %>%
  mutate(Date = floor_date(Date, unit = "second"))
Now we can find the rows which are not in the old data with dplyr::anti_join. If this doesn't work, double-check that the column types are all the same (see the type-checking sketch after the output below).
NewRows <- anti_join(DF, OldSheet)
NewRows
## A tibble: 2 x 4
#  Form_No Date                Name    Age
#  <chr>   <dttm>              <chr> <int>
#1 POS-4   2020-12-01 18:33:14 John     27
#2 POS-5   2020-12-01 20:13:34 Cohe     45
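If anti_join errors or returns rows you didn't expect, the usual culprit is a type mismatch between the two data frames. A minimal way to check, using base sapply and dplyr's glimpse (nothing beyond what is already loaded here):

# Column types of both data frames, side by side; every pair must match
sapply(DF, class)
sapply(OldSheet, class)

# Or inspect both structures at a glance
glimpse(DF)
glimpse(OldSheet)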
Now you can append the new rows to the Google Sheet with sheet_append:
sheet_append(drive_get("DatabaseOutput"),NewRows)
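Putting the pieces together, the whole refresh can be wrapped in one function that you call every 3 hours after fetching DF from MySQL. This is a sketch under the assumptions above: the sheet is named "DatabaseOutput", its columns match the col_types string "cTci", and DF comes straight from dbGetQuery:

library(googledrive)
library(googlesheets4)
library(dplyr)
library(lubridate)

append_new_rows <- function(DF, sheet_name = "DatabaseOutput") {
  ss <- drive_get(sheet_name)
  # Read what is already on the sheet, forcing the same column types as DF
  OldSheet <- read_sheet(ss, col_types = "cTci")
  # Truncate both Date columns to whole seconds so they compare exactly
  DF <- as_tibble(DF) %>%
    mutate(Date = floor_date(as_datetime(Date), unit = "second"))
  OldSheet <- OldSheet %>%
    mutate(Date = floor_date(Date, unit = "second"))
  # Keep only the rows not already on the sheet, then append them
  NewRows <- anti_join(DF, OldSheet, by = c("Form_No", "Date", "Name", "Age"))
  if (nrow(NewRows) > 0) sheet_append(ss, NewRows)
  invisible(NewRows)
}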
Edited to add: I understand your question better now; you just want to append the new entries, because someone else with access to your Google Sheet may have reviewed and made edits to previously uploaded entries. Most of my answer remains the same as before.
Try using the googlesheets4 package instead of googlesheets. This is what worked for me:
ss <- gs4_create("TargetSheet")       # create the spreadsheet once
DF <- read.csv("cust_data.csv")
lastRow <- nrow(DF)                   # remember how many rows have been uploaded
sheet_write(DF, ss = ss, sheet = "TargetSheet")

### 3 hours later and beyond (same R session, so lastRow is still in memory) ###
DF <- read.csv("cust_data.csv")
if (nrow(DF) > lastRow) {             # guard against a batch with no new rows
  newEntries <- DF[(lastRow + 1):nrow(DF), ]
  sheet_append(ss = ss, data = newEntries, sheet = "TargetSheet")
}
lastRow <- nrow(DF)
So for any additional entries, just keep re-running the lines after the "3 hours later" comment every three hours. Another potential solution is to base each new query on the last known position from your previous upload, but I think this pure-R solution should do just fine.
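For completeness, that query-side variant would look something like the sketch below. It is untested against your schema: it assumes the connection object from the question is still open, and it reads the most recent Date back from the sheet to build an incremental WHERE clause:

library(googlesheets4)

ss <- gs4_find("TargetSheet")  # locate the spreadsheet by name
uploaded <- read_sheet(ss, sheet = "TargetSheet", col_types = "cTci")

# Only fetch rows newer than the latest Date already on the sheet
last_seen <- max(uploaded$Date)
Query <- sprintf("SELECT Form_No, Date, Name, Age FROM Test WHERE Date > '%s';",
                 format(last_seen, "%Y-%m-%d %H:%M:%S"))
newEntries <- dbGetQuery(connection, Query)
if (nrow(newEntries) > 0) {
  sheet_append(ss = ss, data = newEntries, sheet = "TargetSheet")
}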