Dplyr - Error: column '' has unsupported type

I have an odd issue when using dplyr on a data.frame to compute the number of missing observations for each group of a character variable. It produces the error: Error: column '' has unsupported type.

To replicate it, I have created a subset. The subset rdata file is available here: rdata file including dftest data.frame. Using the subset I have provided, the code:

dftest %>%
  group_by(file) %>%
  summarise(missings=sum(is.na(v131)))

Will create the error: Error: column 'file' has unsupported type

The str(dftest) returns:

'data.frame':   756345 obs. of  2 variables:
 $ file: atomic  bjir31fl.dta bjir31fl.dta bjir31fl.dta bjir31fl.dta ...
  ..- attr(*, "levels")= chr 
 $ v131: Factor w/ 330 levels "not of benin",..: 6 6 6 6 1 1 1 9 9 9 ...

However, re-subsetting the data frame (here selecting all 756345 rows) and running the dplyr command again produces the expected output.

dftest <- dftest[1:756345,]
dftest %>%
  group_by(file) %>%
  summarise(missings=sum(is.na(v131)))

The str(dftest) now returns:

'data.frame':   756345 obs. of  2 variables:
 $ file: chr  "bjir31fl.dta" "bjir31fl.dta" "bjir31fl.dta" "bjir31fl.dta" ...
 $ v131: Factor w/ 330 levels "not of benin",..: 6 6 6 6 1 1 1 9 9 9 ...

Does anyone have suggestions about what might cause this error, and what to do about it? In my original file I have 300 variables, and dplyr reports that most of them are of unsupported type.

Thanks.

spesseh asked Feb 11 '23 07:02


1 Answer

This seems to be an issue with dplyr, reproduced here with filter, when a column of the data frame has an extra attribute. For example,

> df = data.frame(x=1:10, y=1:10)
> filter(df, x==3) # Works
  x y
1 3 3

Add an attribute to the x column. Notice that str(df) shows x as atomic now, and filter doesn't work:

> attr(df$x, 'width')='broad'
> str(df)
'data.frame':   10 obs. of  2 variables:
 $ x: atomic  1 2 3 4 5 6 7 8 9 10
  ..- attr(*, "width")= chr "broad"
 $ y: int  1 2 3 4 5 6 7 8 9 10
> filter(df, x==3)
Error: column 'x' has unsupported type
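
With 300 variables it may help to see which columns are affected before changing anything. The following is a minimal base R sketch, not part of dplyr: the helper name columns_with_extra_attributes is hypothetical, and it assumes the problem columns are plain vectors (no class) that nevertheless carry attributes, as x does above.

```r
# Hypothetical helper (not a dplyr function): return the names of columns
# that have no class but still carry attributes, like 'x' in the example.
columns_with_extra_attributes <- function(df) {
  flagged <- vapply(
    df,
    # is.object() is TRUE for classed columns such as factors or Dates,
    # which legitimately need their attributes, so those are not flagged.
    function(col) !is.object(col) && !is.null(attributes(col)),
    logical(1)
  )
  names(df)[flagged]
}

df <- data.frame(x = 1:10, y = 1:10)
attr(df$x, "width") <- "broad"
columns_with_extra_attributes(df)  # flags "x" only
```

Factor columns such as v131 in the question are skipped on purpose, since their levels and class attributes are part of the type.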

To make it work, remove the attribute:

> attr(df$x, 'width') = NULL
> filter(df, x==3)
  x y
1 3 3
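
Removing attributes one column at a time is impractical for a wide data frame, so the same fix can be applied to every column at once. This is a sketch under assumptions: strip_extra_attributes is a hypothetical helper name, and it assumes that only unclassed columns should be cleaned, leaving factors and other classed columns alone. It probably also explains why re-subsetting the rows in the question worked, because subsetting a vector with [ drops all attributes except names, dim and dimnames.

```r
# Hypothetical helper: drop all attributes from unclassed columns,
# leaving classed columns (factors, Dates, ...) untouched.
strip_extra_attributes <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.object(col)) {
      return(col)  # keep levels/class on factors and similar columns
    }
    attributes(col) <- NULL  # removes e.g. 'width' or a stray 'levels'
    col
  })
  df
}

df <- data.frame(x = 1:10, y = 1:10, g = factor(rep(c("a", "b"), 5)))
attr(df$x, "width") <- "broad"

clean <- strip_extra_attributes(df)
str(clean)  # x should be listed as int again, g is still a factor
```

Assigning with df[] keeps the data frame structure and column names, so the cleaned result can be passed straight to group_by or filter.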
Kent Johnson answered Feb 25 '23 21:02