I have a larger dataset that follows the same layout: a unique date column, then a data column, then another unique date column, then another data column, and so on. I am trying to subset not just the data columns by name but also the date column that belongs to each one. The code below selects columns based on a list of names, which is part of what I want, but how can I also grab the column immediately before each selected column?
I would like to end up with a data frame containing the Date1, Fire, Date3 and Earth columns, using just NameList.
Here is my reproducible code:
Cnames <- c("Date1","Fire","Date2","Water","Date3","Earth")
MAINDF <- data.frame(replicate(6,runif(120,-0.03,0.03)))
colnames(MAINDF) <- Cnames
NameList <- c("Fire","Earth")
NewDF <- MAINDF[,colnames(MAINDF) %in% NameList]
How about
NameList <- c("Fire","Earth")
idx <- match(NameList, names(MAINDF))
idx <- sort(c(idx-1, idx))
NewDF <- MAINDF[,idx]
Here we use match() to find the index of each desired column, then subtract 1 from each index to grab the column before it. sort() puts the combined indices back into the data frame's column order.
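One thing to watch with the match() approach: it returns NA for a name that is not in the data frame, and an index of 1 would produce a 0 after the subtraction, which R silently drops. A minimal sketch of a guarded version follows, assuming every data column really is preceded by its date column. The function name select_with_preceding is hypothetical and is not part of the original answer.

```r
# Rebuild the example data from the question (seed added for repeatability).
set.seed(1)
Cnames <- c("Date1", "Fire", "Date2", "Water", "Date3", "Earth")
MAINDF <- data.frame(replicate(6, runif(120, -0.03, 0.03)))
colnames(MAINDF) <- Cnames

# Hypothetical helper: return each named column plus the column before it.
select_with_preceding <- function(df, cols) {
  idx <- match(cols, names(df))
  # match() yields NA for names that do not exist in df
  if (anyNA(idx)) {
    stop("Columns not found: ", paste(cols[is.na(idx)], collapse = ", "))
  }
  # A column at position 1 has nothing before it
  if (any(idx == 1L)) {
    stop("A requested column is the first column and has no preceding column")
  }
  # unique() guards against the same index being requested twice
  keep <- sort(unique(c(idx - 1L, idx)))
  df[, keep, drop = FALSE]
}

NameList <- c("Fire", "Earth")
NewDF <- select_with_preceding(MAINDF, NameList)
names(NewDF)
```

With the example data this should keep Date1, Fire, Date3 and Earth, and it stops with an error rather than returning a partial result when a name is misspelled.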
Use which to get the column numbers from the names, and then it's just simple arithmetic:
col.num <- which(colnames(MAINDF) %in% NameList)
NewDF <- MAINDF[,sort(c(col.num, col.num - 1))]
Produces
Date1 Fire Date3 Earth
1 -0.010908003 0.007700453 -0.022778726 -0.016413307
2 0.022300509 0.021341360 0.014204445 -0.004492150
3 -0.021544992 0.014187158 -0.015174048 -0.000495121
4 -0.010600955 -0.006960160 -0.024535954 -0.024210771
5 -0.004694499 0.007198620 0.005543146 -0.021676692
6 -0.010623787 0.015977135 -0.027741109 -0.021102651
...
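Both answers return the columns in the order they appear in MAINDF, because of sort(). If the order given in NameList should be kept instead, one possible variation (a sketch, not taken from either answer) is to interleave the indices with rbind() rather than sorting them:

```r
# Rebuild the example data from the question (seed added for repeatability).
set.seed(1)
Cnames <- c("Date1", "Fire", "Date2", "Water", "Date3", "Earth")
MAINDF <- data.frame(replicate(6, runif(120, -0.03, 0.03)))
colnames(MAINDF) <- Cnames

# Names deliberately listed in a different order than the columns.
NameList <- c("Earth", "Fire")
idx <- match(NameList, names(MAINDF))

# rbind() stacks (idx - 1) above idx; as.vector() then reads the matrix
# column by column, giving date, data, date, data in NameList order.
pair_idx <- as.vector(rbind(idx - 1, idx))
OrderedDF <- MAINDF[, pair_idx]
names(OrderedDF)
```

This assumes every name in NameList exists in MAINDF and is not the first column; otherwise match() returns NA or the subtraction yields 0.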