Use gsub to replace curly apostrophe with straight apostrophe in R list of character vectors

Looking for some guidance on how to replace a curly apostrophe with a straight apostrophe in an R list of character vectors.

I'm replacing the curly apostrophes because, later in the script, I check each list item against a dictionary (using qdapDictionaries) to ensure it's a real word and not garbage. The dictionary uses straight apostrophes, so words with curly apostrophes are being "rejected."
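
To illustrate the mismatch, here is a minimal sketch. It assumes a small hypothetical character vector, dict_Sample, as a stand-in for the real dictionary, since the actual lookup code is not shown in the question. The curly apostrophe is written as the escape \u2019 so the example does not depend on the script file's encoding.

```r
# Hypothetical stand-in for the dictionary; the real lookup is not shown in the question.
dict_Sample <- c("isn't", "can't", "ideal")

word_Straight <- "can't"        # U+0027 straight apostrophe
word_Curly    <- "can\u2019t"   # U+2019 right single quotation mark

word_Straight %in% dict_Sample  # TRUE
word_Curly %in% dict_Sample     # FALSE: the two apostrophes are different characters
```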

A sample of my current code follows. In my test list, item #6 contains a curly apostrophe, and item #2 has a straight apostrophe.

Example:

list_TestWords <- as.list(c("this", "isn't", "ideal", "but", "we", "can’t", "fix", "it"))

func_ReplaceTypographicApostrophes <- function(x) {
  gsub("’", "'", x, ignore.case = TRUE)
}

list_TestWords_Fixed <- lapply(list_TestWords, func_ReplaceTypographicApostrophes)

The result: no change. Item 6 still uses the curly apostrophe. See the output below.

list_TestWords_Fixed
[[1]]
[1] "this"

[[2]]
[1] "isn't"

[[3]]
[1] "ideal"

[[4]]
[1] "but"

[[5]]
[1] "we"

[[6]]
[1] "can’t"

[[7]]
[1] "fix"

[[8]]
[1] "it"

Any help you can offer will be most appreciated!

SarahWeaver asked Oct 18 '17 16:10


1 Answer

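This might work: gsub("[\u2018\u2019\u201A\u201B\u2032\u2035]", "'", x)

The character class covers the left and right single quotation marks (U+2018, U+2019), the low and reversed single quotation marks (U+201A, U+201B), and the prime and reversed prime (U+2032, U+2035). Writing them as \u escapes means the pattern does not rely on how the curly apostrophe was typed or saved in the script.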

I found it over here: http://axonflux.com/handy-regexes-for-smart-quotes
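
As a rough sketch of how this could be applied to the list from the question: the function name func_ReplaceTypographicApostrophes is reused from the question, item 6 is written with the escape \u2019 instead of a literal curly apostrophe (an assumption, to keep the example independent of file encoding), and the utf8ToInt() calls are only there to inspect the code points before and after.

```r
# Test list from the question; item 6 uses U+2019 written as an escape.
list_TestWords <- as.list(c("this", "isn't", "ideal", "but", "we", "can\u2019t", "fix", "it"))

# Replace several apostrophe-like characters with the straight apostrophe (U+0027).
func_ReplaceTypographicApostrophes <- function(x) {
  gsub("[\u2018\u2019\u201A\u201B\u2032\u2035]", "'", x)
}

list_TestWords_Fixed <- lapply(list_TestWords, func_ReplaceTypographicApostrophes)

# Inspect the code points of item 6 before and after the replacement.
utf8ToInt(list_TestWords[[6]])        # contains 8217 (U+2019)
utf8ToInt(list_TestWords_Fixed[[6]])  # contains 39 (U+0027)
```

If the original gsub("’", ...) call made no change, one possible cause is that the literal ’ in the script was not read as U+2019; the escaped form sidesteps that.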

bcarothers answered Oct 05 '22 13:10
