I have a data frame that looks like:
ID Time U1 U2 U3 U4 ...
1 20 1 2 3 5 ..
2 20 2 5 9 4 ..
3 20 2 5 6 4 ..
.
.
And I need to reshape it to look like:
ID Time U
1 20 1
1 20 2
1 20 3
1 20 5
2 20 2
2 20 5
2 20 9
2 20 4
3 20 2
3 20 5
3 20 6
3 20 4
I tried with:
X <- read.table("mydata.txt", header=TRUE, sep=",")
X_D <- as.data.frame(X)
X_new <- stack(X_D, select = -c(ID, Time))
But I haven't managed to get the data into that form. Honestly, I have little experience with stacking/transposing, so any help is greatly appreciated!
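Convert multiple columns into a single column: to combine several data frame columns into one column by pasting their values together, use the unite() function from the tidyr package. Note that unite() joins the columns within each row; it does not stack them into additional rows.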
Method 1: Using stack(). The identifier columns are selected with df[1:2], and stack() is applied to the remaining columns, which places their values in a single column. cbind() then binds the identifier columns to the stacked result.
How do I concatenate two columns in R? To concatenate two columns you can use the paste() function. For example, if you want to combine the two columns A and B in the data frame df, you can use the following code: df['AB'] <- paste(df$A, df$B).
Here's the stack approach:
dat2a <- data.frame(dat[1:2], stack(dat[3:ncol(dat)]))
dat2a
# ID Time values ind
# 1 1 20 1 U1
# 2 2 20 2 U1
# 3 3 20 2 U1
# 4 1 20 2 U2
# 5 2 20 5 U2
# 6 3 20 5 U2
# 7 1 20 3 U3
# 8 2 20 9 U3
# 9 3 20 6 U3
# 10 1 20 5 U4
# 11 2 20 4 U4
# 12 3 20 4 U4
This is very similar to melt from "reshape2":
library(reshape2)
dat2b <- melt(dat, id.vars=1:2)
dat2b
# ID Time variable value
# 1 1 20 U1 1
# 2 2 20 U1 2
# 3 3 20 U1 2
# 4 1 20 U2 2
# 5 2 20 U2 5
# 6 3 20 U2 5
# 7 1 20 U3 3
# 8 2 20 U3 9
# 9 3 20 U3 6
# 10 1 20 U4 5
# 11 2 20 U4 4
# 12 3 20 U4 4
And, very similar to @TylerRinker's answer, but not dropping the "times", is to just use sep = "" to help R guess time and variable names.
dat3 <- reshape(dat, direction = "long", idvar=1:2,
varying=3:ncol(dat), sep = "", timevar="Measure")
dat3
# ID Time Measure U
# 1.20.1 1 20 1 1
# 2.20.1 2 20 1 2
# 3.20.1 3 20 1 2
# 1.20.2 1 20 2 2
# 2.20.2 2 20 2 5
# 3.20.2 3 20 2 5
# 1.20.3 1 20 3 3
# 2.20.3 2 20 3 9
# 3.20.3 3 20 3 6
# 1.20.4 1 20 4 5
# 2.20.4 2 20 4 4
# 3.20.4 3 20 4 4
In all three of those, you end up with four columns, not three, as you describe in your desired output. However, as @ndoogan points out, dropping that fourth column means you're losing information about your data. If you're fine with that, you can always drop that column from the resulting data.frame quite easily (for example, dat2a <- dat2a[-4]).
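The drop-and-reorder step could be sketched as follows. This is a minimal base R example, assuming the sample data from the question; the names long and result are placeholders chosen for illustration. It also sorts by ID so the rows appear in the order shown in the desired output, which the three approaches above do not do on their own.

```r
# Sample data copied from the question
dat <- read.table(text = "ID Time U1 U2 U3 U4
1 20 1 2 3 5
2 20 2 5 9 4
3 20 2 5 6 4", header = TRUE)

# Stack the measurement columns next to the recycled ID/Time columns
long <- data.frame(dat[1:2], stack(dat[3:ncol(dat)]))

# Sort by ID, then by source column, so each ID keeps its U1..U4 order
long <- long[order(long$ID, long$ind), ]

# Keep only the three wanted columns and rename "values" to "U"
result <- setNames(long[c("ID", "Time", "values")], c("ID", "Time", "U"))
rownames(result) <- NULL
result
```

Sorting on the ind factor relies on its levels following the original column order, which holds for the sample data here.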
With base reshape:
dat <- read.table(text="ID Time U1 U2 U3 U4
1 20 1 2 3 5
2 20 2 5 9 4
3 20 2 5 6 4", header=TRUE)
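# Insert a "." between the letters and the digit (U1 -> U.1) in the column names
colnames(dat) <- gsub("([a-zA-Z]*)([0-9])", "\\1.\\2", colnames(dat))
# Note: timevar = "Time" overwrites the existing Time column with the index 1 to 4;
# pick another name (e.g. timevar = "Measure") to keep the original Time values
reshape(dat, varying = 3:ncol(dat), v.names = "U", direction = "long",
        timevar = "Time", idvar = "ID")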
You can also use melt():
library(reshape2)
new_data <- melt(old_data, id.vars=c("ID","Time"),
value.name = "U")
Then remove the 'variable' column:
new_data$variable <- NULL
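If you would rather avoid extra packages, the same three-column result can probably be built directly in base R. The sketch below assumes the sample data from the question and that all U columns share one type; value_cols, n_values and result2 are names made up for this example. Transposing the measurement columns and flattening them reads the values row by row, which gives the ordering shown in the desired output.

```r
# Sample data copied from the question
dat <- read.table(text = "ID Time U1 U2 U3 U4
1 20 1 2 3 5
2 20 2 5 9 4
3 20 2 5 6 4", header = TRUE)

value_cols <- 3:ncol(dat)        # positions of the U columns
n_values <- length(value_cols)   # number of measurements per row

result2 <- data.frame(
  ID   = rep(dat$ID, each = n_values),    # repeat each ID once per measurement
  Time = rep(dat$Time, each = n_values),  # same for Time
  U    = as.vector(t(dat[value_cols]))    # row-wise flattening of the U columns
)
result2
```

Like dropping the 'variable' column, this discards the record of which U column each value came from.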