I have a data.table in which some column names are NA. Trying to change them to character names fails and they stay NA.
I can replace them by switching to a data.frame, but is there a way to do it with data.table?
library(data.table)
dt <- data.table(a = 1:2, b = 2:3)
setDF(dt)
names(dt) <- c(NA,"c")
setDT(dt)
names(dt) <- c("a","b")
names(dt)
# [1] NA "b"
Using data.frame it works:
setDF(dt)
names(dt) <- c("a", "b")
names(dt)
# [1] "a" "b"
EDIT: @akrun suggests using NA_character_, but that doesn't work when several of the names are NA (which is my case; the example above was simplified):
dt <- data.table(a = 1:2, b = 2:3, c = 2:3)
setDF(dt)
names(dt) <- c(NA,NA,"c")
setDT(dt)
setnames(dt, NA_character_, c('a','b'))
Error in setnames(dt, NA_character_, c("a", "b")) : Some items of 'old' are duplicated (ambiguous) in column names: NA
setnames(dt, c(NA_character_,NA_character_), c('a','b'))
Error in setnames(dt, c(NA_character_, NA_character_), c("a", "b")) : Some duplicates exist in 'old': NA
PS:
sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-suse-linux-gnu (64-bit)
Running under: SUSE Linux Enterprise Desktop 12 SP2
Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas_serial.so.0
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8
[4] LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] rvest_0.3.2 xml2_1.1.9000 bindrcpp_0.2 shiny_1.0.5
[5] dplyr_0.7.3 RUnit_0.4.31 gjpoisson_0.4
[17] gplots_3.0.1 moments_0.14 foreach_1.4.3 ggplot2_2.2.1
[21] RODBC_1.3-15 data.table_1.10.4-3 mgcv_1.8-21 nlme_3.1-131
[25] pacman_0.4.6 devtools_1.13.3
loaded via a namespace (and not attached):
[1] bitops_1.0-6 xts_0.10-0 lubridate_1.6.0 bit64_0.9-7
[5] httr_1.3.1 quantDb_0.4.0 RColorBrewer_1.1-2 tools_3.4.2
[9] backports_1.1.1 rredis_1.7.0 R6_2.2.2
[13] KernSmooth_2.23-15 rpart_4.1-11 Hmisc_4.0-3 DBI_0.7
[17] lazyeval_0.2.0 colorspace_1.3-2 nnet_7.3-12 withr_2.0.0
[21] gridExtra_2.3 curl_2.8.1 bit_1.1-12 compiler_3.4.2
[25] htmlTable_1.9 caTools_1.17.1 scales_0.5.0 dygraphs_1.1.1.4
[29] checkmate_1.8.3 odbc_1.1.1 speedglm_0.3-2 stringr_1.2.0
[33] digest_0.6.12 foreign_0.8-69 datashop_0.13.2 base64enc_0.1-3
[37] pkgconfig_2.0.1 htmltools_0.3.6 htmlwidgets_0.9 rlang_0.1.2
[41] ggthemes_3.4.0 bindr_0.1 zoo_1.8-0 gtools_3.5.0
[45] acepack_1.4.1 inline_0.3.14 marketUtils_0.3.8 magrittr_1.5
[49] Formula_1.2-2 Matrix_1.2-11 Rcpp_0.12.12 munsell_0.4.3
[53] stringi_1.1.5 yaml_2.1.14 MASS_7.3-47 RJSONIO_1.3-0
[57] plyr_1.8.4 grid_3.4.2 blob_1.1.0 gdata_2.18.0
[61] ggrepel_0.6.5 lattice_0.20-35 splines_3.4.2 fasttime_1.0-2
[65] hms_0.3 knitr_1.17 reshape2_1.4.2 codetools_0.2-15
[69] fctsUtils_0.4.7 XML_3.98-1.9 glue_1.1.1 latticeExtra_0.6-28
[73] selectr_0.3-1 httpuv_1.3.5 gtable_0.2.0 purrr_0.2.4
[77] tidyr_0.7.1 assertthat_0.2.0 mime_0.5 xtable_1.8-2
[81] survival_2.41-3 quantum_0.13.1 tibble_1.3.4 iterators_1.0.8
[85] memoise_1.1.0 cluster_2.0.6
To rename a column in R, you can use the rename() function from dplyr. For example, to rename the column "A" to "B", run rename(dataframe, B = A).
rename() is a function in the dplyr package that changes one or more column names by name, using new_name = old_name pairs. It returns a modified copy rather than changing the data frame in place, so the result has to be assigned back; the pipe operator %>% only passes the data frame to rename() as its first argument.
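As a small illustration of rename(), here is a sketch assuming dplyr is installed; the data frame df and its columns A and C are made up for the example and do not come from the question.

```r
library(dplyr)

# Hypothetical data frame, used only for illustration
df <- data.frame(A = 1:2, C = 3:4)

# rename() takes new_name = old_name pairs; %>% passes df as the first argument
renamed <- df %>% rename(B = A)

names(renamed)  # column A is now called B
names(df)       # the original is unchanged unless the result is assigned back
```

Note that this works on names that can be referred to unambiguously, so it does not by itself address columns whose names are NA.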
names() gets or sets the names attribute of any object, whereas colnames() gets or sets the column names of a matrix-like object; for a data frame the two are equivalent.
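The difference is easiest to see on a matrix. The following base R sketch uses made-up objects (df and m) purely for illustration.

```r
# Hypothetical objects, used only for illustration
df <- data.frame(a = 1:2, b = 3:4)

# For a data frame, both functions return the column names
same_for_df <- identical(names(df), colnames(df))

# For a matrix, colnames() works on the dimnames, not on the names attribute
m <- matrix(1:4, nrow = 2)
colnames(m) <- c("a", "b")

colnames(m)  # the column names just assigned
names(m)     # NULL: the matrix has no names attribute
```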
This was a bug; thanks for identifying the issue and providing a reproducible example! You should be able to install the current development version of data.table (1.10.5) with:
install.packages('data.table', type = 'source',
repos = 'http://Rdatatable.github.io/data.table')
If that doesn't work directly, please consult the Installation Wiki.
If you're unable to install this version (no administrative rights, or you can only install from CRAN), here's a workaround: the bug emerges when only the old argument of setnames is present (in which case it is, somewhat paradoxically though I think intuitively in usage, interpreted as new).
So to get around this, we need only be sure to use both the old and new arguments to setnames:
setnames(dt, seq_along(dt), c('a', 'b', 'c'))
dt
# a b c
# 1: 1 2 2
# 2: 2 3 3
We can't use names(dt) in the old argument, because there are duplicates in names(dt), and when old is character, we need to be able to match the old and new names one-to-one (i.e., does a belong to the first NA or the second?). The same problem would arise if names(dt) were c('a', 'a', 'b') to start (i.e., that's a separate issue). To get around this, we specify the positions instead of the names.
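The same ambiguity can be sketched with ordinary duplicated names, which avoids having to construct NA names first. The table dt_dup and the new names x, y, z below are hypothetical, and how the by-name call fails may differ between data.table versions, so it is wrapped in tryCatch().

```r
library(data.table)

# Hypothetical table with a duplicated column name ("a" appears twice)
dt_dup <- data.table(a = 1:2, a = 2:3, b = 2:3)

# Renaming by name is ambiguous: which "a" is meant? This is expected to be
# rejected, but the exact behaviour and message depend on the version.
by_name <- tryCatch(
  setnames(dt_dup, "a", "x"),
  error   = function(e) conditionMessage(e),
  warning = function(w) conditionMessage(w)
)

# Renaming by position is unambiguous, even with duplicated names
setnames(dt_dup, seq_along(dt_dup), c("x", "y", "z"))
names(dt_dup)
```

Because positions identify each column uniquely, the positional call does not depend on what the old names were.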