
Why this behavior when coercing a list to character via as.character()?

Tags: list, r, coercion

In the process of (mostly) answering this question, I stumbled across something that I feel like I really should already have seen before. Let's say you've got a list:

l <- list(a = 1:3, b = letters[1:3], c = runif(3))

Attempting to coerce l to various types returns an error:

> as.numeric(l)
Error: (list) object cannot be coerced to type 'double'
> as.logical(l)
Error: (list) object cannot be coerced to type 'logical'

However, I'm apparently allowed to coerce a list to character; I just wasn't expecting this result:

> as.character(l)
[1] "1:3"                                                        
[2] "c(\"a\", \"b\", \"c\")"                                     
[3] "c(0.874045701464638, 0.0843329173512757, 0.809434881201014)"

Rather, if I'm allowed to coerce lists to character, I would have thought I'd see behavior more like this:

> as.character(unlist(l))
[1] "1"                  "2"                  "3"                  "a"                  "b"                 
[6] "c"                  "0.874045701464638"  "0.0843329173512757" "0.809434881201014"

Note that how I specify the list elements originally affects the output of as.character:

> l <- list(a = c(1,2,3), b = letters[1:3], c = runif(3))
> as.character(l)
[1] "c(1, 2, 3)"                                                 
[2] "c(\"a\", \"b\", \"c\")"                                     
[3] "c(0.344991483259946, 0.0492411875165999, 0.625746068544686)"

I have two questions:

  1. How is as.character dredging up the information from my original creation of the list l in order to spit out 1:3 versus c(1,2,3)?
  2. In what circumstances would I want to do this, exactly? When would I want to call as.character() on a list and get output of this form?

joran asked Sep 29 '11 01:09


2 Answers

For non-trivial lists, as.character uses deparse to generate the strings.

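  1. It is not remembering how you created the list. Only if the vector is of integer type and is a consecutive increasing run such as 1, 2, 3, ..., n does it deparse as 1:n.

    c(1,2,3) is double whereas 1:3 is integer, so the two deparse differently.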

  2. No idea :-)

...but look at deparse if you want to understand as.character here:

deparse(c(1L, 2L, 3L)) # 1:3
deparse(c(3L, 2L, 1L)) # c(3L, 2L, 1L)
deparse(c(1, 2, 3))    # c(1, 2, 3)
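
As a rough sketch of the same point (the names ints, dbls and res are only for illustration), the storage type is what separates the two cases, and as.character() on a list reflects it element by element:

```r
ints <- 1:3          # integer vector
dbls <- c(1, 2, 3)   # double vector holding the same values

typeof(ints)         # "integer"
typeof(dbls)         # "double"

# deparse() turns each object back into source-like text
deparse(ints)        # "1:3"
deparse(dbls)        # "c(1, 2, 3)"

# as.character() on a list shows the same distinction per element
res <- as.character(list(ints, dbls))
res                  # "1:3" "c(1, 2, 3)"
```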

Tommy answered Oct 22 '22 12:10


The help file (?as.character) does say:

For lists it deparses the elements individually, except that it extracts the first element of length-one character vectors.
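
A small illustration of that exception, as a sketch (x and out are made-up names): a length-one character element comes back as the bare string, while longer elements are deparsed.

```r
x <- list("a", c("a", "b"), 1:3)
out <- as.character(x)

out[1]  # "a"               length-one character: the element itself
out[2]  # "c(\"a\", \"b\")"  longer character vector: deparsed
out[3]  # "1:3"             integer sequence: deparsed
```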

I'd seen this before in trying to answer a question [not online] about grep. Consider:

> x <- list(letters[1:10],letters[10:19])
> grep("c",x)
[1] 1 2

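grep uses as.character on x, so each component is turned into a single deparsed string beginning with c(. Since both strings contain c(, both components match, even though the second component (letters[10:19], i.e. "j" to "s") has no "c" in it. That took a while to figure out.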
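
If the intent is to search inside each component, one possible approach (a sketch only; has_c is a made-up name, and there may be better ways) is to apply grepl to each component separately:

```r
x <- list(letters[1:10], letters[10:19])

# What grep() effectively searches: one deparsed string per component
as.character(x)

# Search inside each component instead of its deparsed text
has_c <- vapply(x, function(v) any(grepl("c", v, fixed = TRUE)), logical(1))
which(has_c)  # 1
```

Here only the first component is reported, because the pattern is matched against the actual elements rather than against the text c("a", "b", ...).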

On "Why does it do this?", I'd guess that one of the members of R core wanted it to do this.

Karl answered Oct 22 '22 13:10