
Check if character value is a valid R object name

Several months ago I asked something similar, but then I was using JavaScript to check whether a provided string is a "valid" R object name. Now I'd like to achieve the same using nothing but R. I suppose there's a very nice way to do this with some neat (not so) esoteric R function, so regular expressions seem to me like the last line of defence. Any ideas?

Oh, yeah, using back-ticks and stuff is considered cheating. =)

asked Dec 06 '11 07:12 by aL3xa



2 Answers

Edited 2013-1-9 to fix regular expression. Previous regular expression, lifted from page 456 of John Chambers' "Software for Data Analysis", was (subtly) incomplete. (h.t. Hadley Wickham)


There are two issues here. A simple regular expression can identify all syntactically valid names, but some of those names (like if and while) are 'reserved' and cannot be assigned to.

  • Identifying syntactically valid names:

    ?make.names explains that a syntactically valid name:

    [...] consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. Names such as '".2way"' are not valid [...]

    Here is the corresponding regular expression:

     "^((([[:alpha:]]|[.][._[:alpha:]])[._[:alnum:]]*)|[.])$"
    
  • Identifying unreserved syntactically valid names:

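    To identify unreserved names, you can take advantage of the base function make.names(), which constructs syntactically valid names from arbitrary character strings. Because it also appends a dot to reserved words (for example, "if" becomes "if."), a string that make.names() returns unchanged is both syntactically valid and unreserved.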

    isValidAndUnreserved <- function(string) {
        make.names(string) == string
    }
    
    isValidAndUnreserved(".jjj")
    # [1] TRUE
    isValidAndUnreserved(" jjj")
    # [1] FALSE
    
  • Putting it all together:

    isValidName <- function(string) {
        grepl("^((([[:alpha:]]|[.][._[:alpha:]])[._[:alnum:]]*)|[.])$", string)
    }
    
    isValidAndUnreservedName <- function(string) {
        make.names(string) == string
    }
    
    testValidity <- function(string) {
        valid <- isValidName(string)
        unreserved <- isValidAndUnreservedName(string)
        reserved <- (valid & ! unreserved)
        list("Valid"=valid,
             "Unreserved"=unreserved,
             "Reserved"=reserved)
    }
    
    testNames <- c("mean", ".j_j", ".", "...", "if", "while", "TRUE", "NULL",
                   "_jj", "  j", ".2way") 
    t(sapply(testNames, testValidity))
    
          Valid Unreserved Reserved
    mean  TRUE  TRUE       FALSE   
    .j_j  TRUE  TRUE       FALSE
    .     TRUE  TRUE       FALSE     
    ...   TRUE  TRUE       FALSE   
    if    TRUE  FALSE      TRUE    
    while TRUE  FALSE      TRUE    
    TRUE  TRUE  FALSE      TRUE    
    NULL  TRUE  FALSE      TRUE    
    _jj   FALSE FALSE      FALSE   
      j   FALSE FALSE      FALSE   # Note: these tests are for "  j", not "j"
    .2way FALSE FALSE      FALSE
    

For more discussion of these issues, see the r-devel thread linked to by @Hadley in the comments below.
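
The two checks above can also be folded into one vectorized helper. The following is only a sketch: classifyNames is a hypothetical name, not something from base R or from the answer, and treating NA input as "not a valid name" is an assumption made here so that the result never contains NA. Note also that [[:alpha:]] depends on the current locale.

```r
# Hypothetical helper (not part of base R): classify each element of a
# character vector using the regular expression and make.names() from above.
classifyNames <- function(x) {
    x <- as.character(x)
    pattern <- "^((([[:alpha:]]|[.][._[:alpha:]])[._[:alnum:]]*)|[.])$"

    # Assumption: NA is reported as invalid rather than propagated as NA.
    valid      <- !is.na(x) & grepl(pattern, x)
    unreserved <- !is.na(x) & make.names(x) == x

    data.frame(name       = x,
               valid      = valid,
               unreserved = unreserved,
               reserved   = valid & !unreserved,
               stringsAsFactors = FALSE)
}

res <- classifyNames(c("mean", "if", "_jj", NA))
print(res)
```

Returning a data frame rather than a list keeps one row per input string, which makes it easy to filter, for example res$name[res$reserved].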

answered Oct 11 '22 05:10 by Josh O'Brien


As Josh suggests, make.names is probably the best solution to this. Not only will it handle weird punctuation, it'll also flag reserved words:

make.names(".x")   # ".x"
make.names("_x")   # "X_x"
make.names("if")   # "if."
make.names("function")  # "function."
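
Since make.names() is vectorized, the same idea can be applied to several candidate names at once by comparing its output with the input. This is a minimal sketch; the variable names candidates, fixed and changed are made up for illustration.

```r
# Compare each candidate with what make.names() turns it into;
# any name that comes back changed was invalid or reserved.
candidates <- c(".x", "_x", "if", "function")
fixed      <- make.names(candidates)
changed    <- fixed != candidates

print(data.frame(candidates, fixed, changed))
```

Only ".x" comes back unchanged here, so it is the only candidate that can be used directly as an object name.
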
answered Oct 11 '22 05:10 by Hong Ooi