
Remove data.table column labels/attributes (imported data)

This seems like a rudimentary task, but I'm having trouble removing data.table column labels/attributes from data imported from SAS.

My data.table DT is an import from a SAS file. Not all columns have labels, and some carry two attributes (a label and a SAS format). I can't share my data, and since it's imported I can't easily replicate it, but here is a partial structure of DT:

> str(DT)
Classes ‘data.table’ and 'data.frame':  96293709 obs. of  150 variables:
 $ Col1               : chr  "Y" "N" "N" "N" ...
  ..- attr(*, "label")= chr "some label, description goes on and on"
 $ Col2               : chr  "N" "N" "N" "Y" ...
  ..- attr(*, "label")= chr "some label 2, description goes on and on"
 $ Col3                    : Date, format: "1994-08-07" "1994-08-07" "1994-08-07" "1994-08-07" ...
 $ Col4                          : chr  "M" "M" "M" "M" ...
  ..- attr(*, "label")= chr "some label 3, description goes on and on"
  ..- attr(*, "format.sas")= chr "$"
 $ Col5                     : num  1e+07 1e+07 1e+07 1e+07 1e+07 ...
  ..- attr(*, "label")= chr "some label 4, description goes on and on"
 $ Col6                       : Date, format: "2000-01-01" "2005-03-10" "2013-06-01" "2015-06-01" ...

I'm trying to remove all of these attributes because, when I use certain columns to create new ones, the attributes are inherited by the new columns. This is undesired: it prevents me from merging with another data.table whose columns don't have the labels. I thought the only way to prevent that is to remove the attributes (labels) from the original data DT.

I tried

> setattr(DT, "label", NULL)
> setattr(DT, "format.sas", NULL)

and I get no error, but nothing happens. When I check the structure afterwards, I get the same thing as before: the labels/attributes have not been removed. What am I doing wrong here? I know I have to use setattr somehow, as I don't want DT to be copied (it's rather large).

Ankhnesmerira asked Mar 19 '26 08:03


1 Answer

I think the attributes are stored on each column, not on the data.table as a whole, which would explain why setattr(DT, ...) appears to do nothing. Compare attributes(DT) with lapply(DT, attributes) to see if this is the case. Here's an example which I think replicates what you're trying to do:

library(data.table)
DT <- data.table(a=1:3,b=2:4)
attr(DT$a, "label") <- "a label"
attr(DT$b, "label") <- "a label"
attr(DT$b, "sas format") <- "ddmmyy10."

str(DT)
#Classes ‘data.table’ and 'data.frame':  3 obs. of  2 variables:
# $ a: atomic  1 2 3
#  ..- attr(*, "label")= chr "a label"
# $ b: atomic  2 3 4
#  ..- attr(*, "label")= chr "a label"
#  ..- attr(*, "sas format")= chr "ddmmyy10."
# - attr(*, ".internal.selfref")=<externalptr> 

DT[, names(DT) := lapply(.SD, setattr, "label", NULL)]
DT[, names(DT) := lapply(.SD, setattr, "sas format", NULL)]

str(DT)
#Classes ‘data.table’ and 'data.frame':  3 obs. of  2 variables:
# $ a: int  1 2 3
# $ b: int  2 3 4
# - attr(*, ".internal.selfref")=<externalptr> 
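If the real table also has Date columns, as in the question, a plain loop with setattr may be a little easier to read. The following is only a sketch on a toy table: the column names a and d are made up, and the attribute names "label" and "format.sas" are taken from the str() output in the question (the example above used "sas format" instead). It first shows where the attributes live, then removes just those two attributes by reference and leaves "class" untouched so that Dates stay Dates.

```r
library(data.table)

# Toy table standing in for the SAS import (column names are hypothetical)
DT <- data.table(a = 1:3, d = as.Date("2000-01-01") + 0:2)
attr(DT$a, "label") <- "a label"
attr(DT$d, "format.sas") <- "DATE"

# Attributes of the table itself: there is no "label" here, so
# setattr(DT, "label", NULL) has nothing to remove
names(attributes(DT))

# Attributes of each column: this is where the labels are stored
lapply(DT, attributes)

# Remove only the SAS-related attributes, column by column.
# setattr() modifies each column vector in place, so DT is not copied.
sas_attrs <- c("label", "format.sas")
for (col in names(DT)) {
  for (att in sas_attrs) {
    setattr(DT[[col]], att, NULL)
  }
}

str(DT)
```

Removing every attribute indiscriminately would also drop "class", which would turn Date columns into plain numbers, so listing the attribute names explicitly is probably the safer choice.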
thelatemail answered Mar 21 '26 21:03


