
R - how to prevent row.names when selecting rows from a data frame

Tags: dataframe, r, row

Suppose I create a dataframe (just to keep it simple):

testframe <- data.frame( a = c(1,2,3,4), b = c(5,6,7,8))

Thus, I have two variables (columns) and four cases (rows).

If I select some of the rows BEGINNING WITH THE FIRST row, I get a subset of the dataframe, e.g.:

testframe2 <- testframe[1:2,] #selecting the first two rows

But if I do the same with rows NOT BEGINNING WITH THE FIRST ROW, I get what looks like another column containing the row numbers of the original dataframe.

testframe3 <- testframe[3:4,] #selecting the last two rows

leads to:

  a b
3 3 7
4 4 8

What can I do to prevent the new row.names variable in the first place? I know that I can delete it afterwards but maybe it is still possible to avoid it from the beginning.

Thanks for your help!

deschen asked Oct 24 '13 12:10



1 Answer

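Subsetting copies the row.names from the original data frame. They are labels, not an extra column, and in your first example the copied names just happen to be 1 and 2, so nothing looks different. Just rename the rows using rownames<- like this...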

rownames( testframe3 ) <- seq_len( nrow( testframe3 ) )
#   a b
# 1 3 7
# 2 4 8
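
To see that the 3 and 4 are row names rather than a third column, and as a shorter way to reset them, here is a small sketch. It relies on the base R behaviour that assigning NULL to rownames() restores the automatic 1..n numbering; the variable names are the ones from the question.

```r
testframe <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))
testframe3 <- testframe[3:4, ]

# The row labels are an attribute of the data frame, not a column
ncol(testframe3)       # 2
names(testframe3)      # "a" "b"
rownames(testframe3)   # "3" "4"

# Assigning NULL drops the inherited names and restores automatic numbering
rownames(testframe3) <- NULL
rownames(testframe3)   # "1" "2"
```

This should give the same result as the seq_len() assignment, and it also works on a data frame with zero rows.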

Programmatically, seq_len( nrow( x ) ) is preferred to, say, 1:nrow( x ). Look at what happens in the edge case where you select a data.frame of zero rows...

df <- testframe[0,]
# [1] a b
# <0 rows> (or 0-length row.names)
rownames(df) <- seq_len( nrow( df ) ) #  No error thrown - returns a length 0 vector of rownames

#  But...
rownames(df) <- 1:nrow( df )
# Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
#   invalid 'row.names' length

#  Because...
1:nrow( df )
# [1] 1 0
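
The difference comes from how the colon operator counts: 1:0 counts down from 1 to 0, whereas seq_len(0) returns an empty vector. A minimal sketch of the same pitfall in a loop (the two counters are only there for illustration):

```r
df <- data.frame(a = numeric(0), b = numeric(0))   # a data frame with zero rows

seq_len(nrow(df))   # integer(0)
1:nrow(df)          # 1 0

# A loop over 1:nrow(df) runs twice on an empty data frame ...
n_colon <- 0
for (i in 1:nrow(df)) n_colon <- n_colon + 1

# ... while a loop over seq_len(nrow(df)) is skipped entirely
n_seq <- 0
for (i in seq_len(nrow(df))) n_seq <- n_seq + 1

c(n_colon, n_seq)   # 2 0
```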

Alternatively, you can do it in one step by wrapping the subset in a call to data.frame. This is really inefficient if you want to derive the number of rows programmatically (because you will have to subset twice), so I don't recommend it over the rownames<- method:

data.frame( testframe[3:4,] , row.names = 1:2 )
#   a b
# 1 3 7
# 2 4 8
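
If you do this often, you could wrap the subsetting and the reset in one small helper so the subset is only computed once. This is just a sketch: take_rows is a made-up name, not a base R function.

```r
# Hypothetical helper: subset rows, then drop the inherited row names
take_rows <- function(df, idx) {
  out <- df[idx, , drop = FALSE]   # drop = FALSE keeps a data frame even with one column
  rownames(out) <- NULL            # restore automatic 1..n row names
  out
}

testframe <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))
take_rows(testframe, 3:4)
#   a b
# 1 3 7
# 2 4 8
```

Because the number of rows is never needed explicitly, the zero-row edge case does not require special handling here.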
Simon O'Hanlon answered Oct 17 '22 06:10