
Weird behaviour when ordering a data frame

Tags: dataframe, r

I have the following data frame that I want to order by the fifth column ("Distance"). When I try

df.order <- df[order(df[, 5]), ]

I always get the following error message:

Error in order(df[, 5]) : unimplemented type 'list' in 'orderVector1'

I don't know why R considers my data frame to be a list. Running is.data.frame(df) returns TRUE, but I have to admit that is.list(df) also returns TRUE. Is it possible to force my data frame to be only a data frame and not a list? Thanks for your help.

structure(list(ID = list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), 
               Latitude = list(50.7368, 50.7368, 50.7368, 50.7369, 50.7369, 50.737, 50.737, 50.7371, 50.7371, 50.7371), 
               Longitude = list(6.0873, 6.0873, 6.0873, 6.0872, 6.0872, 6.0872, 6.0872, 6.0872, 6.0872, 6.0872), 
               Elevation = list(269.26, 268.99, 268.73, 268.69, 268.14, 267.87, 267.61, 267.31, 267.21, 267.02), 
               Distance = list(119.4396, 119.4396, 119.4396, 121.199, 121.199, 117.5658, 117.5658, 114.9003, 114.9003, 114.9003), 
               RxPower = list(-52.6695443922406, -52.269130891243, -52.9735258244422, -52.2116571930007, -51.7784534281727, -52.7703448813654, -51.6558862949081, -52.2892907635308, -51.8322993596551, -52.4971436682333)), 
          .Names = c("ID", "Latitude", "Longitude", "Elevation", "Distance", "RxPower"),
          row.names = c(NA, 10L), class = "data.frame")
Yann asked Jan 25 '13 12:01


3 Answers

The columns of your data frame are lists, not atomic vectors. You can convert the data frame to the "classical" format, in which each column is a vector, using as.data.frame and unlist:

df2 <- as.data.frame(lapply(df, unlist))

Now the new data frame can be sorted in the intended way:

df2[order(df2[, 5]), ]
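
As a rough check of the conversion above, the following sketch rebuilds a small data frame with list columns and inspects the column classes before and after. The three-row data frame is a hypothetical, shortened stand-in for the one in the question, not the original data.

```r
# Hypothetical three-row data frame with list columns, built the same way
# as the dput() output in the question.
df <- structure(list(ID = list(1, 2, 3),
                     Distance = list(121.199, 114.9003, 119.4396)),
                row.names = c(NA, 3L), class = "data.frame")

before <- sapply(df, class)   # every column reports "list"

# unlist() flattens each list column into an atomic vector
df2 <- as.data.frame(lapply(df, unlist))
after <- sapply(df2, class)   # every column reports "numeric"

# Ordering now works; the rows come back in the order 2, 3, 1
df2[order(df2$Distance), ]
```

Note that this relies on every cell holding exactly one value. If a cell were NULL or held several values, unlist would return a vector of a different length and the conversion would need extra handling.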
Sven Hohenstein answered Nov 09 '22 12:11


Here is a small example that illustrates the problem:

df <- structure(list(ID = c(1, 2, 3, 4), 
          Latitude = c(50.7368, 50.7368, 50.7368, 50.7369), 
          Longitude = c(6.0873, 6.0873, 6.0873, 6.0872), 
          Elevation = c(269.26, 268.99, 268.73, 268.69), 
          Distance = c(119.4396, 119.4396, 119.4396, 121.199), 
          RxPower = c(-52.6695443922406, -52.269130891243, -52.9735258244422, 
                         -52.2116571930007)), 
          .Names = c("ID", "Latitude", "Longitude", "Elevation", "Distance", "RxPower"), 
          row.names = c(NA, 4L), class = "data.frame")

Notice that list occurs only once here, and that all the values are wrapped in c(.) rather than list(.). In your data every column is wrapped in list(.), which is why sapply(df, class) on your data reports class list for all columns.

With the data frame above, every column is numeric:

> sapply(df, class)
#       ID  Latitude Longitude Elevation  Distance   RxPower 
# "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" 

Now order works (here on the fourth column, Elevation):

> df[order(df[,4]), ]  
#   ID Latitude Longitude Elevation Distance   RxPower
# 4  4  50.7369    6.0872    268.69 121.1990 -52.21166
# 3  3  50.7368    6.0873    268.73 119.4396 -52.97353
# 2  2  50.7368    6.0873    268.99 119.4396 -52.26913
# 1  1  50.7368    6.0873    269.26 119.4396 -52.66954
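
The difference between list(.) and c(.) described above can also be seen without a data frame. This is only a sketch: the names list_col and vec_col are made up for illustration, and tryCatch is used so the script keeps running when order fails.

```r
# The same three values, once as a list and once as an atomic vector
list_col <- list(3, 1, 2)
vec_col  <- c(3, 1, 2)

# order() has no method for plain lists, so this call fails;
# tryCatch() captures the error condition instead of stopping the script
res <- tryCatch(order(list_col), error = function(e) e)
inherits(res, "error")   # TRUE

# On the atomic vector, order() returns the sorting permutation
ord <- order(vec_col)    # 2 3 1
vec_col[ord]             # 1 2 3
```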
Arun answered Nov 09 '22 12:11


This turns your data.frame of lists into a matrix:

mat <- sapply(df,unlist)

Now you can order it.

mat[order(mat[,5]),]

If all columns are of one type, e.g., numeric, a matrix is often preferable, because many operations are faster on matrices than on data.frames. If you need a data.frame after all, you can convert back using as.data.frame(mat).
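
A minimal sketch of the round trip described above, using a hypothetical three-row data frame of list columns rather than the full data from the question:

```r
# Hypothetical data frame with list columns (stand-in for the question's data)
df <- structure(list(ID = list(1, 2, 3),
                     Distance = list(121.199, 114.9003, 119.4396)),
                row.names = c(NA, 3L), class = "data.frame")

# sapply() simplifies the unlisted columns into a numeric matrix
mat <- sapply(df, unlist)
is.matrix(mat)   # TRUE

# Order the matrix rows by the Distance column, selected by name
sorted <- mat[order(mat[, "Distance"]), ]

# Convert back to a data.frame if one is needed downstream
back <- as.data.frame(sorted)
```

Keep in mind that a matrix holds a single type, so this route only makes sense when all columns can share one type, as they do here.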

Btw, a data.frame is a special kind of list and thus is.list returns TRUE for every data.frame.
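
To make that last point concrete, here is a small sketch with an ordinary data frame (the column name x is made up). It suggests that is.list(df) being TRUE is not the problem in itself; what matters is whether the individual columns are lists.

```r
# An ordinary data frame with one atomic (numeric) column
df <- data.frame(x = c(3, 1, 2))

is.data.frame(df)   # TRUE
is.list(df)         # TRUE: a data frame is stored as a list of columns
typeof(df)          # "list"

is.list(df$x)       # FALSE: the column itself is an atomic vector
ord <- order(df$x)  # works, giving 2 3 1
```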

Roland answered Nov 09 '22 11:11