Binding columns with similar column names in the same dataframe in R

Tags:

r

I have a data frame that looks somewhat like this:

df <- data.frame(0:2, 1:3, 2:4, 5:7, 6:8, 2:4, 0:2, 1:3, 2:4)
colnames(df) <- rep(c('a', 'b', 'c'), 3)
> df
  a b c a b c a b c
1 0 1 2 5 6 2 0 1 2
2 1 2 3 6 7 3 1 2 3
3 2 3 4 7 8 4 2 3 4

There are multiple columns that have the same name. I would like to rearrange the data frame so that the columns sharing a name are stacked into a single "supercolumn", leaving only the unique column names a, b and c, each holding 9 values.

Any thoughts on how to do this? Thanks in advance!

asked Apr 04 '13 05:04

tkvn

2 Answers

This will do the trick, I suppose.

Explanation

df[,names(df) == 'a'] will select all columns with name a

unlist will convert those columns into a single vector, column by column

unname will remove the stray element names that unlist attaches to that vector (they are element names, not row names).

unique(names(df)) will give you unique column names in df

sapply will apply the inline function to all values of unique(names(df))

> df
  a b c a b c a b c
1 0 1 2 5 6 2 0 1 2
2 1 2 3 6 7 3 1 2 3
3 2 3 4 7 8 4 2 3 4
> sapply(unique(names(df)), function(x) unname(unlist(df[,names(df)==x])))
      a b c
 [1,] 0 1 2
 [2,] 1 2 3
 [3,] 2 3 4
 [4,] 5 6 2
 [5,] 6 7 3
 [6,] 7 8 4
 [7,] 0 1 2
 [8,] 1 2 3
 [9,] 2 3 4
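Note that sapply returns a matrix here. If a data frame is wanted, the same idea can be wrapped in a small helper. This is only a sketch: stack_same_named is a hypothetical name, not a function from any package, and it assumes every column name occurs the same number of times.

```r
# Example data from the question: nine columns, three distinct names.
df <- data.frame(0:2, 1:3, 2:4, 5:7, 6:8, 2:4, 0:2, 1:3, 2:4)
colnames(df) <- rep(c("a", "b", "c"), 3)

# stack_same_named() is a hypothetical helper name, not part of any package.
stack_same_named <- function(d) {
  nms <- unique(names(d))
  # For each unique name, pull all matching columns and flatten them
  # column by column into one vector, dropping element names.
  cols <- lapply(nms, function(nm) unlist(d[names(d) == nm], use.names = FALSE))
  names(cols) <- nms
  # Assumes every name occurs the same number of times, so the
  # resulting vectors all have equal length.
  as.data.frame(cols)
}

stacked <- stack_same_named(df)
```

For the example df, stacked is a data frame with 9 rows and the columns a, b and c. Passing use.names = FALSE to unlist does the job of unname in one step.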

answered Oct 22 '22 14:10

CHP

My version:

library(reshape)
as.data.frame(with(melt(df), split(value, variable)))
  a b c
1 0 1 2
2 1 2 3
3 2 3 4
4 0 1 2
5 1 2 3
6 2 3 4
7 0 1 2
8 1 2 3
9 2 3 4

In the step using melt I transform the dataset:

> melt(df)
Using  as id variables
   variable value
1         a     0
2         a     1
3         a     2
4         b     1
5         b     2
6         b     3
7         c     2
8         c     3
9         c     4
10        a     0
11        a     1
12        a     2
13        b     1
14        b     2
15        b     3
16        c     2
17        c     3
18        c     4
19        a     0
20        a     1
21        a     2
22        b     1
23        b     2
24        b     3
25        c     2
26        c     3
27        c     4

Then I split up the value column for each unique level of variable using split:

$a
[1] 0 1 2 0 1 2 0 1 2

$b
[1] 1 2 3 1 2 3 1 2 3

$c
[1] 2 3 4 2 3 4 2 3 4

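This then only needs an as.data.frame to become the data structure you need. Note that rows 4 to 6 of the result repeat the values of the first a, b and c columns rather than showing the 5:7, 6:8 and 2:4 from the second group in the question's df, so check the output against your own data when column names are duplicated.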
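The same reshape-then-split idea can also be sketched in base R, with the duplicated names made unique first so that every column stays distinct while reshaping. This is an illustration under assumptions, not the method above: it uses stack from utils instead of melt, the names tmp, long and wide are made up for the example, and it assumes no original column name already ends in a dot followed by digits.

```r
# Example data from the question.
df <- data.frame(0:2, 1:3, 2:4, 5:7, 6:8, 2:4, 0:2, 1:3, 2:4)
colnames(df) <- rep(c("a", "b", "c"), 3)

# Make the names unique first: a, b, c, a.1, b.1, c.1, a.2, ...
tmp <- df
names(tmp) <- make.unique(names(df))

# Long format: one row per cell, with columns `values` and `ind`.
long <- stack(tmp)

# Recover the original name by stripping the numeric suffix.
# Assumption: no original column name ends in ".<digits>".
long$variable <- sub("\\.[0-9]+$", "", as.character(long$ind))

# One vector per original name, then back to a data frame.
wide <- as.data.frame(split(long$values, long$variable))
```

Because split keeps the row order within each group, the values from the first a column come first in wide$a, then those from the second, then the third.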

answered Oct 22 '22 15:10

Paul Hiemstra
