Split data.frame into groups by column name

I'm new to R. I have a data frame with column names like these:

file_001   file_002   block_001   block_002   red_001   red_002   ... etc.
  0.05       0.2        0.4         0.006       0.05       0.3
  0.01       0.87       0.56        0.4         0.12       0.06

I want to split them into groups by the column name, to get a result like this:

group_file
file_001   file_002
  0.05       0.2
  0.01       0.87

group_block
block_001   block_002
  0.4        0.006
  0.56       0.4

group_red
red_001    red_002
  0.05       0.3
  0.12       0.06

... etc.

My file is huge, and the number of groups is not known in advance. The grouping should depend only on the start of the column name.

Keity asked Nov 14 '17 14:11


1 Answer

In base R, you can use sub and split.default like this to return a list of data.frames:

myDfList <- split.default(dat, sub("_\\d+", "", names(dat)))

This returns

myDfList
$block
  block_001 block_002
1      0.40     0.006
2      0.56     0.400

$file
  file_001 file_002
1     0.05     0.20
2     0.01     0.87

$red
  red_001 red_002
1    0.05    0.30
2    0.12    0.06

split.default treats the data.frame as a list of columns and splits those columns into groups according to its second argument. Here, we use sub with the regular expression "_\\d+" (the backslash is doubled inside an R string) to remove the underscore and the digits following it from each column name, which yields the splitting values "block", "file", and "red".
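For anyone who wants to try this out, here is a minimal reproducible sketch. It assumes dat holds only the six columns and two rows shown in the question (the real file has more), and the intermediate name groups is introduced only for illustration.

```r
# Rebuild the example data from the question (only the six columns shown there).
dat <- data.frame(
  file_001  = c(0.05, 0.01),
  file_002  = c(0.2, 0.87),
  block_001 = c(0.4, 0.56),
  block_002 = c(0.006, 0.4),
  red_001   = c(0.05, 0.12),
  red_002   = c(0.3, 0.06)
)

# Remove the "_<digits>" part of each column name to get one group key per column.
groups <- sub("_\\d+", "", names(dat))

# split.default() splits the columns of dat by those keys,
# returning a named list with one data.frame per key.
myDfList <- split.default(dat, groups)

print(names(myDfList))
print(myDfList$file)
```

Because the group keys are derived from the column names themselves, the number of groups does not need to be known beforehand.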

As a side note, it is typically a good idea to keep these data.frames in a list and work with them through functions like lapply. See gregor's answer to this post for some motivating examples.
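A small sketch of what working with such a list might look like. The list below is built by hand to mirror the output shown above, and the names groupRowMeans and nColsPerGroup are hypothetical, chosen only for this illustration.

```r
# Hand-built list of data.frames, shaped like the myDfList output shown in the answer.
myDfList <- list(
  block = data.frame(block_001 = c(0.4, 0.56), block_002 = c(0.006, 0.4)),
  file  = data.frame(file_001 = c(0.05, 0.01), file_002 = c(0.2, 0.87)),
  red   = data.frame(red_001 = c(0.05, 0.12), red_002 = c(0.3, 0.06))
)

# Apply the same function to every group: the mean of each row within the group.
groupRowMeans <- lapply(myDfList, rowMeans)

# sapply() simplifies the result to a named vector: number of columns per group.
nColsPerGroup <- sapply(myDfList, ncol)

print(groupRowMeans$file)
print(nColsPerGroup)
```

The same call works however many groups the list contains, which is the main reason to prefer a list over separate objects named group_file, group_block, and so on.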

lmo answered Oct 19 '22 04:10