Subset columns based on list of column names and bring the column before it

I have a larger dataset that follows the same pattern throughout: a unique date column, then a data column, then another unique date column, another data column, and so on. I am trying to subset not just the data columns by name but also the date column that goes with each one. The code below selects columns based on a list of names, which is part of what I want, but how can I also grab the column immediately before each selected column?

I'm looking to end up with a data frame containing the Date1, Fire, Date3, and Earth columns (using just NameList).

Here is my reproducible code:

Cnames <- c("Date1","Fire","Date2","Water","Date3","Earth")
MAINDF <- data.frame(replicate(6,runif(120,-0.03,0.03)))
colnames(MAINDF) <- Cnames

NameList <- c("Fire","Earth")

NewDF <- MAINDF[,colnames(MAINDF) %in% NameList] 
Trevor Nederlof asked Dec 18 '14 21:12


2 Answers

How about

NameList <- c("Fire","Earth")

idx <- match(NameList, names(MAINDF))
idx <- sort(c(idx-1, idx))

NewDF <- MAINDF[,idx] 

Here we use match() to find the index of each desired column, then subtract 1 from each index to get the column immediately before it. Sorting the combined indices keeps the columns in their original order.
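To make the index arithmetic concrete, here is a minimal sketch that runs the same steps on the column names alone. It uses only the names from the question; the intermediate variable `both` is introduced here purely for illustration.

```r
# Column names from the question; the data values do not matter here.
Cnames <- c("Date1", "Fire", "Date2", "Water", "Date3", "Earth")
NameList <- c("Fire", "Earth")

idx <- match(NameList, Cnames)   # positions of the requested columns: 2 6
both <- sort(c(idx - 1, idx))    # add each preceding position: 1 2 5 6

Cnames[both]
# "Date1" "Fire"  "Date3" "Earth"
```

Because match() returns positions in the order of NameList, the sort() call is what restores the original column order.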

MrFlick answered Sep 21 '22 17:09


Use which() to get the column numbers from the names; after that it's just simple arithmetic:

col.num <- which(colnames(MAINDF) %in% NameList)
NewDF <- MAINDF[,sort(c(col.num, col.num - 1))]

This produces:

         Date1         Fire        Date3        Earth
1 -0.010908003  0.007700453 -0.022778726 -0.016413307
2  0.022300509  0.021341360  0.014204445 -0.004492150
3 -0.021544992  0.014187158 -0.015174048 -0.000495121
4 -0.010600955 -0.006960160 -0.024535954 -0.024210771
5 -0.004694499  0.007198620  0.005543146 -0.021676692
6 -0.010623787  0.015977135 -0.027741109 -0.021102651
...
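Both answers assume that every requested name exists and that no requested column sits in the first position or directly after another requested column. As a rough sketch of how those cases might be handled, the helper below wraps the same arithmetic with a few guards. The function name `subset_with_prev` and its argument names are hypothetical and do not come from either answer.

```r
# Hypothetical helper (not from either answer): select the named columns
# plus the column immediately to the left of each one.
subset_with_prev <- function(df, names_wanted) {
  col.num <- match(names_wanted, colnames(df))

  # match() returns NA for names that are not present; fail loudly instead.
  not_found <- names_wanted[is.na(col.num)]
  if (length(not_found) > 0) {
    stop("Columns not found: ", paste(not_found, collapse = ", "))
  }

  prev <- col.num - 1
  prev <- prev[prev >= 1]                  # a first-position column has no predecessor
  keep <- sort(unique(c(prev, col.num)))   # unique() avoids duplicated columns
  df[, keep, drop = FALSE]                 # drop = FALSE always returns a data frame
}

# Same setup as in the question.
Cnames <- c("Date1", "Fire", "Date2", "Water", "Date3", "Earth")
MAINDF <- data.frame(replicate(6, runif(120, -0.03, 0.03)))
colnames(MAINDF) <- Cnames

NewDF <- subset_with_prev(MAINDF, c("Fire", "Earth"))
colnames(NewDF)
# "Date1" "Fire"  "Date3" "Earth"
```

Note that which(colnames(MAINDF) %in% NameList) silently skips names that are not present, whereas match() returns NA for them, which is why this sketch checks for NA explicitly.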
BrodieG answered Sep 18 '22 17:09