Suppose I create a dataframe (just to keep it simple):
testframe <- data.frame( a = c(1,2,3,4), b = c(5,6,7,8))
Thus, I have two variables (columns) and four cases (rows).
If I select some of the rows BEGINNING WITH THE FIRST ROW, I get a subset of the dataframe, e.g.:
testframe2 <- testframe[1:2,] #selecting the first two rows
But if I do the same with rows NOT BEGINNING WITH THE FIRST ROW, I get what looks like another column containing the row numbers of the original dataframe.
testframe3 <- testframe[3:4,] #selecting the last two rows
leads to:
  a b
3 3 7
4 4 8
What can I do to prevent the new row.names variable in the first place? I know that I can delete it afterwards but maybe it is still possible to avoid it from the beginning.
Thanks for your help!
To remove the row names or column names from a matrix, set them to NULL; this drops all of the names.
For a dataframe, you can likewise assign NULL with rownames<-. A dataframe always has row names, so this does not remove them entirely; it resets them to the automatic sequence 1, 2, ..., n.
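As a minimal sketch of the NULL assignment described above, applied to the testframe from the question: the check on ncol() is added here only to illustrate that the leading numbers are row names rather than an extra column.

```r
testframe <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))
testframe3 <- testframe[3:4, ]

# The leading 3 and 4 are row names, not a third column
ncol(testframe3)       # 2
rownames(testframe3)   # "3" "4"

# Assigning NULL discards the inherited names; R falls back to 1..n
rownames(testframe3) <- NULL
rownames(testframe3)   # "1" "2"
```

This also behaves sensibly for a zero-row dataframe, since no length has to be supplied.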
Using bracket notation on an R dataframe, you can select rows by index, by row name, or by a condition on a column value. The base R function subset() gives the same results for conditions. Outside base R, the dplyr package provides dplyr::filter() for selecting rows from a dataframe.
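The selection styles mentioned above might look like this on the question's testframe. This is a base R sketch only; dplyr::filter() is left out because it needs an extra package, and the variable names are chosen just for illustration.

```r
testframe <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))

by_index     <- testframe[c(2, 4), ]          # rows 2 and 4 by position
by_name      <- testframe[c("2", "4"), ]      # the same rows by row name
by_condition <- testframe[testframe$a > 2, ]  # rows where a > 2
by_subset    <- subset(testframe, a > 2)      # same rows via subset()

# Each bracket or subset() selection keeps the original row names
rownames(by_condition)  # "3" "4"
```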
To select a specific column, type the name of the dataframe, followed by a $, and then the name of the column, for example a column named payment. R simplifies the result to a vector rather than returning a one-column dataframe.
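A short sketch of that simplification, using column a of the question's testframe in place of the payment column (which is not defined anywhere in this thread). The drop = FALSE variant is shown as the usual way to keep a dataframe instead.

```r
testframe <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))

col_vec <- testframe$a                     # simplified to a numeric vector
col_df  <- testframe[, "a", drop = FALSE]  # stays a one-column dataframe

is.data.frame(col_vec)  # FALSE
is.data.frame(col_df)   # TRUE
```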
It copies the row.names from the original dataset. Just rename the rows using rownames<- like this:
rownames( testframe3 ) <- seq_len( nrow( testframe3 ) )
# a b
# 1 3 7
# 2 4 8
When doing this programmatically, seq_len( nrow( x ) ) is preferred to, say, 1:nrow( x ). Look at what happens in the edge case where you select a data.frame of zero rows:
df <- testframe[0,]
# [1] a b
# <0 rows> (or 0-length row.names)
rownames(df) <- seq_len( nrow( df ) ) # No error thrown - returns a length 0 vector of rownames
# But...
rownames(df) <- 1:nrow( df )
# Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
# invalid 'row.names' length
# Because...
1:nrow( df )
# [1] 1 0
Alternatively, you can do it in one step by wrapping the subset in a call to data.frame. This is really inefficient if you want to derive the number of rows programmatically (because you will have to subset twice), and I don't recommend it over the rownames<- method:
data.frame( testframe[3:4,] , row.names = 1:2 )
# a b
#1 3 7
#2 4 8
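If the subset-then-rename pattern comes up often, it could be wrapped in a small function. The name subset_rows below is hypothetical, not something provided by base R, and the sketch assumes that resetting to automatic row names (by assigning NULL) is what you want.

```r
# Hypothetical helper: subset rows, then reset row names to 1..n
subset_rows <- function(df, rows) {
  out <- df[rows, , drop = FALSE]  # subset once, keep dataframe shape
  rownames(out) <- NULL            # reset to automatic row names
  out
}

testframe <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))

subset_rows(testframe, 3:4)
#   a b
# 1 3 7
# 2 4 8

empty <- subset_rows(testframe, 0)  # zero rows, no error
```

This subsets only once and avoids having to compute the number of rows at all.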