Is there an R function equivalent to Stata's 'order' command?

Tags:

stata

R's 'order' behaves like Stata's 'sort', so it isn't the equivalent I'm looking for. Here's an example dataset (only variable names listed):

v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13 v14 v15 v16 v17 v18

and here's the output I expect:

v1 v2 v3 v4 v5 v7 v8 v9 v10 v11 v12 v17 v18 v13 v14 v15 v6 v16

In R, I know two ways to do this:

data <- data[,c(1:5,7:12,17:18,13:15,6,16)]

names <- c("v1", "v2", "v3", "v4", "v5", "v7", "v8", "v9", "v10", "v11", "v12",  "v17", "v18", "v13", "v14", "v15", "v6", "v16")
data <- data[names]

To get the same output in Stata, I can run two lines:

order v17 v18, before(v13)
order v6 v16, last

In the idealized data above, the variable names tell us their positions. In most real datasets, though, variables have names like 'age' and 'gender' that carry no positional information, and there may be more than 50 of them. That is where Stata's 'order' shows its advantage: we don't need to know a variable's exact position, only its name:

order age, after(gender)

Is there a base R function for this, or a package that provides one? Thanks in advance.

tweetinfo <- data.frame(uid=1:50, mid=2:51, annotations=3:52, bmiddle_pic=4:53, created_at=5:54, favorited=6:55, geo=7:56, in_reply_to_screen_name=8:57, in_reply_to_status_id=9:58, in_reply_to_user_id=10:59, original_pic=11:60, reTweetId=12:61, reUserId=13:62, source=14:63, thumbnail_pic=15:64, truncated=16:65)
noretweetinfo <- data.frame(uid=21:50, mid=22:51, annotations=23:52, bmiddle_pic=24:53, created_at=25:54, favorited=26:55, geo=27:56, in_reply_to_screen_name=28:57, in_reply_to_status_id=29:58, in_reply_to_user_id=30:59, original_pic=31:60, reTweetId=32:61, reUserId=33:62, source=34:63, thumbnail_pic=35:64, truncated=36:65)
retweetinfo <- data.frame(uid=41:50, mid=42:51, annotations=43:52, bmiddle_pic=44:53, created_at=45:54, deleted=46:55, favorited=47:56, geo=48:57, in_reply_to_screen_name=49:58, in_reply_to_status_id=50:59, in_reply_to_user_id=51:60, original_pic=52:61, source=53:62, thumbnail_pic=54:63, truncated=55:64)
tweetinfo$type <- "ti"
noretweetinfo$type <- "nr"
retweetinfo$type <- "rt"
gtinfo <- rbind(tweetinfo, noretweetinfo)
gtinfo$deleted=""
gtinfo <- gtinfo[,c(1:16,18,17)]
retweetinfo <- transform(retweetinfo, reTweetId="", reUserId="")
retweetinfo <- retweetinfo[,c(1:5,7:12,17:18,13:15,6,16)]
gtinfo <- rbind(gtinfo, retweetinfo)
write.table(gtinfo, file="C:/gtinfo.txt", row.names=F, col.names=T, sep="\t", quote=F)
# rm(list=ls(all=T))


asked Sep 22 '12 14:09

leoce

1 Answer

Because I'm procrastinating and experimenting with different things, here's a function that I whipped up. Ultimately, it depends on append:

moveme <- function(invec, movecommand) {
  # Split the command on ";" into separate moves, then each move into tokens
  movecommand <- lapply(strsplit(strsplit(movecommand, ";")[[1]], ",|\\s+"), 
                        function(x) x[x != ""])
  # For each move, separate the names to move from the keyword (and its target)
  movelist <- lapply(movecommand, function(x) {
    Where <- x[which(x %in% c("before", "after", "first", "last")):length(x)]
    ToMove <- setdiff(x, Where)
    list(ToMove, Where)
  })
  myVec <- invec
  # Apply the moves one at a time, in the order given
  for (i in seq_along(movelist)) {
    temp <- setdiff(myVec, movelist[[i]][[1]])  # names left after removing the movers
    A <- movelist[[i]][[2]][1]                  # keyword: before/after/first/last
    if (A %in% c("before", "after")) {
      ba <- movelist[[i]][[2]][2]               # the target column name
      if (A == "before") {
        after <- match(ba, temp)-1
      } else if (A == "after") {
        after <- match(ba, temp)
      }    
    } else if (A == "first") {
      after <- 0
    } else if (A == "last") {
      after <- length(myVec)
    }
    # Re-insert the moved names at the computed position
    myVec <- append(temp, values = movelist[[i]][[1]], after = after)
  }
  myVec
}
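
Since everything in the function above funnels into that final append call, it may help to see how append's after argument behaves on its own. This is just a minimal illustration in base R; the vector and the names mid, front and back are made up for the example:

```r
# append() inserts `values` after position `after` in a vector.
x <- c("a", "b", "e")

mid   <- append(x, c("c", "d"), after = 2)   # insert after the 2nd element
front <- append(x, "z", after = 0)           # after = 0 puts the values first
back  <- append(x, "z", after = length(x))   # after = length(x) puts them last

print(mid)
print(front)
print(back)
```

So "first" maps to after = 0, "last" maps to the length of the vector, and "before"/"after" map to the matched position of the target name (minus one for "before").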

Here's some sample data representing the names of your dataset:

x <- paste0("v", 1:18)

Imagine now that we wanted "v17" and "v18" before "v3", "v6" and "v16" at the end, and "v5" at the beginning:

moveme(x, "v17, v18 before v3; v6, v16 last; v5 first")
#  [1] "v5"  "v1"  "v2"  "v17" "v18" "v3"  "v4"  "v7"  "v8"  "v9"  "v10" "v11" "v12"
# [14] "v13" "v14" "v15" "v6"  "v16"

So, the obvious usage would be, for a data.frame named "df":

df[moveme(names(df), "how you want to move the columns")]

And, for a data.table named "DT" (which, as @mnel points out, would be more memory efficient):

setcolorder(DT, moveme(names(DT), "how you want to move the columns"))

Note that compound moves are separated by semicolons.

The recognized moves are:

before (move the specified columns to before another named column)
after (move the specified columns to after another named column)
first (move the specified columns to the first position)
last (move the specified columns to the last position)
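
If you only need one or two moves and would rather not carry the full parser around, the same idea can be sketched with two small base R helpers. The names move_before and move_last are hypothetical (they are not part of moveme or of any package), and the data frame is a placeholder with the question's 18 column names. The two calls mirror the Stata lines "order v17 v18, before(v13)" and "order v6 v16, last":

```r
# Hypothetical helpers, for illustration only.

# Move `cols` so they sit immediately before `target` in the name vector `nms`.
move_before <- function(nms, cols, target) {
  rest <- setdiff(nms, cols)                       # names left after pulling out cols
  append(rest, cols, after = match(target, rest) - 1)
}

# Move `cols` to the end of the name vector `nms`.
move_last <- function(nms, cols) {
  c(setdiff(nms, cols), cols)
}

# Placeholder data frame with columns v1 ... v18
df <- as.data.frame(matrix(0, nrow = 2, ncol = 18,
                           dimnames = list(NULL, paste0("v", 1:18))))

new_order <- move_before(names(df), c("v17", "v18"), "v13")
new_order <- move_last(new_order, c("v6", "v16"))
df <- df[new_order]
names(df)
```

This should give the column order asked for in the question. If installing a package is acceptable, dplyr's relocate(), with its .before and .after arguments, appears to cover much the same ground.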


answered Nov 15 '22 07:11

A5C1D2H2I1M1N2O1R2T1
