
Substitute letters with corresponding set of letters

Tags:

regex

r

I am stuck on a minor problem and haven't found the right search terms for it. I have letters from "A" to "N" and want to replace those after "G" with "A" to "G" according to their position in the alphabet, so that "H" becomes "A", "I" becomes "B", and so on. Using gsub for that seems cumbersome. Is there a regex that can do it in a smarter way?

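k <- rep(LETTERS[1:14], 2)
# one call per letter; each result has to be assigned back to k
k <- gsub(pattern = "H", replacement = "A", x = k)
k <- gsub(pattern = "I", replacement = "B", x = k)
k <- gsub(pattern = "J", replacement = "C", x = k)
k <- gsub(pattern = "K", replacement = "D", x = k)
# etc.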

Isn't there some way I can convert the characters to integers, do the calculation on the integer values, and then cast back? Or is there an inverse of LETTERS? as.numeric() and as.integer() return NA.

Sebastian asked Jun 23 '12 18:06


3 Answers

This translates H-N to A-G:

chartr("HIJKLMN", "ABCDEFG", k)
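
As a quick check, here is a minimal sketch of the same call on the question's vector. The names `old`, `new` and `res` are only illustrative, and building the two strings with `paste()` is an assumption about how one might avoid typing them by hand. Letters not listed in `old` pass through unchanged, and `chartr` raises an error if `old` is longer than `new`.

```r
k <- rep(LETTERS[1:14], 2)

# Build the translation strings from LETTERS instead of typing them out
old <- paste(LETTERS[8:14], collapse = "")  # "HIJKLMN"
new <- paste(LETTERS[1:7],  collapse = "")  # "ABCDEFG"

res <- chartr(old, new, k)

# A-G are untouched and H-N map onto A-G, so the result is A-G four times
identical(res, rep(LETTERS[1:7], 4))
# [1] TRUE
```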
G. Grothendieck answered Oct 04 '22 00:10


My first thought whenever I see problems like this is match:

AG <- LETTERS[1:7]
HN <- LETTERS[8:14]

k <- rep(LETTERS[1:14],2)
n <- AG[match(k, HN)]
ifelse(is.na(n), k, n)
# [1] "A" "B" "C" "D" "E" "F" "G" "A" "B" "C" "D" "E" "F" "G" "A" "B" "C" "D" "E"
#[20] "F" "G" "A" "B" "C" "D" "E" "F" "G"

I'd construct an inverse LETTERS function the same way:

invLETTERS <- function(x) match(x, LETTERS[1:26])
invLETTERS(k)
# [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14  1  2  3  4  5  6  7  8  9 10 11
#[26] 12 13 14
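
With the positions available as integers, the "calculate and cast back" idea from the question could look roughly like this. It is only a sketch: `idx` and `wrapped` are illustrative names, and the modulo step assumes the input contains nothing beyond "N" that should be treated differently.

```r
k <- rep(LETTERS[1:14], 2)

idx     <- match(k, LETTERS)     # "A" -> 1, ..., "N" -> 14
wrapped <- (idx - 1) %% 7 + 1    # 8..14 wrap around to 1..7
res     <- LETTERS[wrapped]      # index back into LETTERS

identical(res, rep(LETTERS[1:7], 4))
# [1] TRUE

# For a single string, utf8ToInt() gives the code point directly
utf8ToInt("H") - utf8ToInt("A") + 1L
# [1] 8
```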
Aaron left Stack Overflow answered Oct 04 '22 01:10


Here's a clean and straightforward solution:

k <- rep(LETTERS[1:14],2)

# (1) Create a lookup vector whose elements can be indexed into  
#     by their names and will return their associated values
subs <- setNames(rep(LETTERS[1:7], 2), LETTERS[1:14])
subs
#   A   B   C   D   E   F   G   H   I   J   K   L   M   N 
# "A" "B" "C" "D" "E" "F" "G" "A" "B" "C" "D" "E" "F" "G" 

# (2) Use it.
unname(subs[k])
#  [1] "A" "B" "C" "D" "E" "F" "G" "A" "B" "C" "D" "E" "F" "G"
# [15] "A" "B" "C" "D" "E" "F" "G" "A" "B" "C" "D" "E" "F" "G"
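
One thing to keep in mind with a named lookup vector: a value that is not among its names comes back as NA. A small sketch of that behaviour, with a fallback borrowed from the `ifelse` idea in the `match` answer; the input `x` and the letter "Z" are made up for illustration.

```r
subs <- setNames(rep(LETTERS[1:7], 2), LETTERS[1:14])

x   <- c("A", "H", "N", "Z")   # "Z" has no entry in subs
res <- unname(subs[x])
res
# [1] "A" "A" "G" NA

# Keep the original value wherever the lookup found nothing
ifelse(is.na(res), x, res)
# [1] "A" "A" "G" "Z"
```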
Josh O'Brien answered Oct 03 '22 23:10