
How to edit multi-person objects in R

Tags:

r

I find the following behaviour of the R person object rather unexpected:

Let's create a multi-person object:

a = c(person("Huck", "Finn"), person("Tom", "Sawyer"))

Imagine we want to update the given name of one person in the object:

a[[1]]$given <- 'Huckleberry'

Then if we inspect our object, to my surprise we have:

> a
[1] "  <> [] ()" "Tom Sawyer"

Where'd Huckleberry Finn go?! (Note that if we try this with just a single person object, it works fine.) Why does this happen?

How can we do the above so that we get the more logical behavior of correcting just the first name?

cboettig asked Oct 21 '22 07:10


1 Answer

The syntax you want here is

a <-  c(person("Huck", "Finn"), person("Tom", "Sawyer"))
a[1]$given<-"Huckleberry"
a

#[1] "Huckleberry Finn" "Tom Sawyer"

A group of people is still a "person", and it has its own special indexing function [.person and concatenation function c.person, so it may behave differently than you were expecting. The problem is that assigning through [[ ]] disturbs the underlying hidden list.
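A quick way to see this is to check what the collection and its indexed pieces report about themselves. This is only a small sketch using base R calls (class, length, inherits); no output is shown because the point is what you can check, not the exact printout:

```r
a <- c(person("Huck", "Finn"), person("Tom", "Sawyer"))

class(a)                   # the whole collection has class "person"
length(a)                  # number of people held in the outer list

inherits(a[1], "person")   # `[` keeps the "person" wrapper
inherits(a[[1]], "person") # so does `[[`, unlike `[[` on a plain list
length(a[[1]])             # a one-person collection, not a bare record
```

So reading with [[ ]] hands back a wrapped one-person object rather than the inner record, which matters once that object is assigned back.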

What's interesting is that nearly all the indexing methods are overloaded for person, but not [<- or [[<-, and that is really what causes the error. Up to this point, the two forms behave the same:

`$<-`(`[`(a,1), "given", "Huckleberry") #works
`$<-`(`[[`(a,1), "given", "Huckleberry") #works

but when we get to

`[<-`(a, 1, `$<-`(`[`(a,1), "given", "Huckleberry")) #works
`[[<-`(a, 1, `$<-`(`[[`(a,1), "given", "Huckleberry")) #does not work

we see a difference. The special wrapping/unwrapping that happens during retrieval does not happen during assignment.
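Written out step by step, the working form is an extract, modify, write-back sequence. A minimal sketch, assuming the behaviour described above; the variable name p is only illustrative:

```r
a <- c(person("Huck", "Finn"), person("Tom", "Sawyer"))

p <- a[1]                 # one-person collection, still wrapped in the outer list
p$given <- "Huckleberry"  # `$<-` is overloaded for person, so this edits the record
a[1] <- p                 # `[<-` swaps in the inner record, so no extra nesting

format(a)                 # one formatted string per person
```

Writing back with a[1] rather than a[[1]] is the important part: the value being assigned is itself a list of one record, and single-bracket assignment takes that record out of its wrapper.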

What's going on is that a "person" is always a list of lists: the outer list holds all the people, and each inner list holds one person's data. You can think of the data like this

x<-list(
    list(name="a"),list(name="b")
)

y<-list(
    list(name="c")
)

where x is a collection of two people and y is a "single" person. When you do

x[1]<-y
x

you end up with

list(
    list(name="c"),list(name="b")
)

since you're replacing a list with a list, which is how [ indexing works with lists. But if you try to replace the element at [[1]] with a list of lists, that list will get nested. For example

x[[1]]<-y
x

becomes

list(
    list(list(name="c")),list(name="b")
)

And that extra level of nesting is what's confusing R when it goes to print the person in the first position. That first person won't have any named elements at the second level, so when it goes to print, it will return

emptyp <- structure(list(structure(list(), class="person")), class="person")
utils:::format.person(emptyp)
#  "  <> [] ()"

which gives us the placeholder symbols where it's trying to place the name, e-mail address, role, and comment.
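The "no named elements at the second level" point can be checked directly with the plain-list stand-ins x and y from earlier. A small sketch; x_single and x_double are just illustrative names for the two outcomes:

```r
# Plain-list stand-ins for a two-person collection and a "single" person
x <- list(list(name = "a"), list(name = "b"))
y <- list(list(name = "c"))

x_single <- x
x_single[1] <- y      # `[<-` copies y's inner record into slot 1

x_double <- x
x_double[[1]] <- y    # `[[<-` stores y itself, adding a level

names(x_single[[1]])        # "name": the record's fields are where expected
names(x_double[[1]])        # NULL: slot 1 is now an unnamed list
names(x_double[[1]][[1]])   # "name": the fields sit one level too deep
```

Code that looks for fields such as given or family in slot 1 finds nothing after the double-bracket assignment, which is why the formatted person comes out empty.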

MrFlick answered Oct 23 '22 02:10