Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Replacing escaped double quotes by double quotes in R

I am writing some html code to an SQL databse using RMySQL (but I guess my problem is rather a general R question than really related to SQL or RMySQL). So I am trying something like this:

con <- RMySQL(...) # some connection    
html <- "<div style='text-align: center; font-family: Arial;'><span style='font-size: 14pt;'>Some text without any tricky symbols.</span></div>"    
    query <- c('INSERT INTO table (htmlfield) VALUES (\"', html, '"') 
    dbSendQuery(con,paste(query, collapse = ""))

Trouble is, R's paste will replace the double quotes in single quotes (i.e. '"') to the escaped sequence \", i.e.:

> paste(query, collapse = "")
[1] "INSERT INTO table (htmlfield) VALUES (\"<div style='text-align: center; font-family: Arial;'><span style='font-size: 14pt;'>Some text without any tricky symbols.</span></div>\""

If I change the single quotes in the vector query to double quotes, and the single quotes in html to doubles, the problem is then on the side of the character string html, since then the double quotes in html get replaced by the escaped sequence.

What's the easiest way to handle a substitution of the escaped characters?

I tried gsub('\\\"','"',html) which did not work as intended and the solutions suggested in the post Ignore escape characters (backslashes) in R strings but I could not make it work.

Thanks for your attention, Philipp

like image 296
Philipp Avatar asked Dec 16 '22 18:12

Philipp


2 Answers

I see two problems with what you included in your Question. The first looks like a typo. After:

html <- "<div style='text-align: center; font-family: Arial;'><span style='font-size: 14pt;'>Some text without any tricky symbols.</span></div>"   

You have:

query <- c('INSERT INTO table (htmlfield) VALUES (\"', html, '"')
                                                  ^^^^^^^^^^^^^^^

Notice you escape one string but not the other. You don't need to escape them, but it doesn't matter if you do. You also meant '")' for the last string which is, I suspect, the real source of the error you are getting. paste rather than c is more useful here. If I combine these, we get:

query <- paste('INSERT INTO table (htmlfield) VALUES ("', html, '")', sep = "")

that we can use directly:

dbSendQuery(con, query)

The second problem, and one that many people make, is to confuse the printed representation of an object with the object itself. If we print query, we see this:

> query
[1] "INSERT INTO table (htmlfield) VALUES (\"<div style='text-align: center; font-family: Arial;'><span style='font-size: 14pt;'>Some text without any tricky symbols.</span></div>\")"

The printed representation of the string is always enclosed in "" double quotes, and as such the internal " need to be escaped. What you want to look at is the actual string. We can do that with cat or writeLines - I prefer the latter as it adds the "\n" to the end of the string automagically:

> writeLines(query)
INSERT INTO table (htmlfield) VALUES ("<div style='text-align: center; font-family: Arial;'><span style='font-size: 14pt;'>Some text without any tricky symbols.</span></div>")

Notice how the " are now not escaped. That is the SQL that would be executed by the database server. If that is valid SQL for your DB then it will work.

like image 78
Gavin Simpson Avatar answered Dec 19 '22 06:12

Gavin Simpson


You forgot to escape the backslashes themselves in the strings (I think, it would be the case with most programming languages, not sure if the same goes for R).

gsub("\\\"", "\"", html)
like image 25
orlp Avatar answered Dec 19 '22 07:12

orlp