
R: set duplicate 'row.names' to a numeric data frame

Tags:

r

My original data frame diasyhoras has 3 columns: "Dia", "Visitas" and "Hora".

I need to take the "Dia" column and use its values as row names.

str(diasyhoras)
'data.frame':   175 obs. of  3 variables:
 $ Dia    : Factor w/ 7 levels "Domingo","Jueves",..: 1 3 4 5 2 7 6 1 3 4 ...
 $ Visitas: num  271 493 787 853 285 712 782 16 157 734 ...
 $ Hora   : int  0 0 0 0 0 0 0 1 1 1 ...

The end goal was to use the new data frame (numeric values only) to plot a heatmap with the d3heatmap package from RStudio (I did not find a single tutorial on this package, so I'm doing my best).

The help page for d3heatmap says that the first argument should be "A numeric matrix".

I've tried this:

1. diasyhoras2 <- diasyhoras[,-1] #Removes the "Dia" column and creates a new df.

2. rownames(diasyhoras2) <- diasyhoras[,1] 

However, step 2 gives me this error, because I do have duplicated values in my "Dia" column:

Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘Domingo’, ‘Jueves’, ‘Lunes’, ‘Martes’, ‘Miércoles’, ‘Sábado’, ‘Viernes’

UPDATE 1:

This is not possible, and it was not necessary. What I needed to do was transform the data frame from "long" to "wide" format (with reshape2) to feed my heatmap. It was a nice exercise to try to do it using base R. Thanks to all.
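For readers who land here with the same problem, the long-to-wide step could look roughly like the sketch below. It uses base R's tapply() on a small made-up data frame (the values are invented for illustration, not the original data), and it assumes one row per day and one column per hour is the layout the heatmap needs. The reshape2 call shown in the comment is a guess at an equivalent, not the code actually used.

```r
# Toy stand-in for diasyhoras (values invented for illustration)
diasyhoras <- data.frame(
  Dia     = factor(rep(c("Lunes", "Martes", "Domingo"), times = 2)),
  Visitas = c(10, 20, 30, 40, 50, 60),
  Hora    = rep(0:1, each = 3)
)

# Long -> wide: one row per day, one column per hour.
# tapply() returns a numeric matrix whose dimnames come from the
# grouping variables, so no duplicate row names are ever needed.
wide <- with(diasyhoras, tapply(Visitas, list(Dia = Dia, Hora = Hora), sum))

# A possible reshape2 equivalent (assumption, not run here):
# wide <- reshape2::acast(diasyhoras, Dia ~ Hora, value.var = "Visitas")

wide
```

Because each day now appears exactly once, its name can serve as a row name, and the result is already a numeric matrix.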

Asked by Omar Gonzales, Jul 29 '15 17:07



1 Answer

You can use make.names(..., unique = TRUE) to get unique row names:

rownames(diasyhoras2) <- make.names(diasyhoras[,1], unique = TRUE)

Here's a quick example of what will happen to the names ...

rep(month.abb[1:2], 3)
# [1] "Jan" "Feb" "Jan" "Feb" "Jan" "Feb"
make.names(rep(month.abb[1:2], 3), unique = TRUE)
# [1] "Jan"   "Feb"   "Jan.1" "Feb.1" "Jan.2" "Feb.2"

Unfortunately there is no way around this if you want to use the days as row names of your data frame. In R, as the error states, duplicate row names are not allowed in data frames. They are, however, allowed in matrices so you may want to go that route instead. I am not familiar with the d3heatmap package so I cannot say whether you would get your desired result if you used a matrix.

x <- data.frame(a = rep(month.abb[1:2], 2))
rownames(x) <- x$a
# Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
#   duplicate 'row.names' are not allowed
# In addition: Warning message:
# non-unique values when setting 'row.names': ‘Feb’, ‘Jan’ 
m <- as.matrix(x)
rownames(m) <- x$a
m
#     a    
# Jan "Jan"
# Feb "Feb"
# Jan "Jan"
# Feb "Feb" 
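The matrix above is a character matrix, because the only column is the label itself. As a minimal sketch of the same idea applied to numeric data like the question's, the example below drops the label column before converting, so the matrix stays numeric. The data frame is a made-up stand-in with invented values; whether d3heatmap handles repeated row names as desired is not verified here.

```r
# Toy stand-in for the question's data (values invented for illustration)
df <- data.frame(
  Dia     = rep(c("Lunes", "Martes"), 2),
  Visitas = c(271, 493, 787, 853),
  Hora    = c(0L, 0L, 1L, 1L)
)

# Keep only the numeric columns so as.matrix() yields a numeric matrix
m <- as.matrix(df[, -1])

# Matrices, unlike data frames, accept duplicated row names
rownames(m) <- as.character(df$Dia)

m
```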
Answered by Rich Scriven, Nov 03 '22 02:11