
dplyr invalid subscript type list

Tags: r, dplyr

I have run into an error in a script I am writing that only occurs when dplyr is loaded. I first encountered it after I found a dplyr function I wanted to use and installed and loaded the package. Here is an example that reproduces the error:

First I read in a table from Excel whose values I am going to use as column indices:

library(readxl)
examplelist <- read_excel("example.xlsx")

The contents of the file are:

1   2   3   4
1   1   4   1
2   3   2   1
4   4   1   4

And then I build a data frame:

testdf = data.frame(1:12, 13:24, 25:36, 37:48)

And then I have a loop that calls a function that uses the values of examplelist as indices.

testfun <- function(df, a, b, c, d){
  value1 <- df[[a]]
  value2 <- df[[b]]
  value3 <- df[[c]]
  value4 <- df[[d]]
}

for (i in 1:nrow(examplelist)){
  testfun(testdf, examplelist[i, 1], examplelist[i, 2], 
      examplelist[i, 3], examplelist[i, 4])
}

When I run this script without dplyr, everything is fine, but with dplyr it gives me the error:

 Error in .subset2(x, i, exact = exact) : invalid subscript type 'list' 

Why would having dplyr cause this error, and how can I fix it?

Walker in the City asked Dec 08 '17 22:12



2 Answers

MKR's answer is a valid solution; I will elaborate a bit more on the why and offer some alternatives.

The readxl package is part of the tidyverse, and its read_excel function returns a tibble (tbl_df). A tibble is a special type of data frame that differs from base behaviour in a few ways, notably printing and subsetting. As the tibble documentation puts it:

Tibbles also clearly delineate [ and [[: [ always returns another tibble, [[ always returns a vector. No more drop = FALSE

So examplelist[i, n] returns a 1x1 tibble, not a vector of length 1. A tibble is a list underneath, so passing it to df[[a]] produces the "invalid subscript type 'list'" error. That is also why wrapping it in as.numeric works.

library(readxl)

examplelist <- read_excel("example.xlsx")

class(examplelist[1, 1])
# [1] "tbl_df"     "tbl"        "data.frame"

class(examplelist[[1, 1]])
# [1] "numeric"

class(as.numeric(examplelist[1, 1]))
# [1] "numeric"

class(as.data.frame(examplelist)[1, 1])
# [1] "numeric"

My own workflow tends towards the tidyverse, so I would subset with [[ here. If you don't want tibbles at all, convert the result with as.data.frame.
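As a rough sketch of the [[ option applied to the loop from the question: the tibble below is built inline as a stand-in for example.xlsx, so the column names V1 to V4 and the three-row layout are assumptions, as is having the tibble package installed. testfun is also changed to return the columns it selects, only so there is something to inspect afterwards.

```r
library(tibble)

# Stand-in for read_excel("example.xlsx"); names and layout are hypothetical
examplelist <- tibble(V1 = c(1, 2, 4), V2 = c(1, 3, 4),
                      V3 = c(4, 2, 1), V4 = c(1, 1, 4))
testdf <- data.frame(1:12, 13:24, 25:36, 37:48)

# Same idea as the question's testfun, but returning the selected columns
testfun <- function(df, a, b, c, d) {
  list(df[[a]], df[[b]], df[[c]], df[[d]])
}

results <- vector("list", nrow(examplelist))
for (i in seq_len(nrow(examplelist))) {
  # [[i, j]] extracts a single value, so df[[a]] receives a plain number
  results[[i]] <- testfun(testdf,
                          examplelist[[i, 1]], examplelist[[i, 2]],
                          examplelist[[i, 3]], examplelist[[i, 4]])
}
```

Using seq_len(nrow(...)) instead of 1:nrow(...) is a small extra safeguard: the loop is simply skipped if the table happens to have no rows.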

Kevin Arseneau answered Sep 22 '22 09:09



I can reproduce this issue even without loading dplyr. The culprit seems to be how the items of examplelist are used. If you print examplelist[1, 2], you will see it is a 1x1 data.frame, whereas a, b, c and d are expected to be plain numbers. Converting examplelist[i, 1] and the others with as.numeric avoids the error. Change the call to testfun to:

testfun(testdf, as.numeric(examplelist[i, 1]), as.numeric(examplelist[i, 2]), 
          as.numeric(examplelist[i, 3]), as.numeric(examplelist[i, 4]))
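To see the difference in isolation, here is a small base R sketch that needs neither the Excel file nor any package. The data frame and its column names are made up for illustration, and drop = FALSE is used only to imitate the 1x1 shape that examplelist[i, j] has in the question.

```r
# Plain data frame standing in for the imported sheet (hypothetical names)
examplelist <- data.frame(V1 = c(1, 2, 4), V2 = c(1, 3, 4))
testdf <- data.frame(1:12, 13:24, 25:36, 37:48)

# drop = FALSE keeps the 1x1 data frame shape instead of a bare number
cell <- examplelist[2, 2, drop = FALSE]

# Indexing with the 1x1 data frame fails; capture the message instead of stopping
err <- tryCatch(testdf[[cell]], error = function(e) conditionMessage(e))

# After conversion the value is a plain number and selects a column as intended
idx <- as.numeric(cell)
col <- testdf[[idx]]
```

Printing err should show the same kind of subscript error as in the question, while col holds the third column of testdf.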
MKR answered Sep 20 '22 09:09
