I have a data file where individual samples are separated by a blank line and each field is on its own line:
age 20
weight 185
height 72

age 87
weight 109
height 60

age 15
weight 109
height 58
...
How can I read this file into a dataframe such that each row represents a sample with columns of age, weight, height?
age weight height
1 20 185 72
2 87 109 60
3 15 109 58
...
@user1317221_G showed the approach I would take, but resorted to loading an extra package and explicitly generating the groups. The grouping (ID) variable is the key to getting any reshape-type answer to work. The matrix answers don't have that limitation.
Here's a closely related approach in base R:
mydf <- read.table(header = FALSE, stringsAsFactors=FALSE,
text = "age 20
weight 185
height 72
age 87
weight 109
height 60
age 15
weight 109
height 58
")
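# Create your id variable: number the occurrences of each field name,
# so the nth "age", "weight" and "height" rows share id n.
# seq_along(V1) keeps the ids integer; ave(V1, V1, ...) would coerce
# them to character, which sorts "10" before "2" in xtabs() below.
mydf <- within(mydf, {
  id <- ave(seq_along(V1), V1, FUN = seq_along)
})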
With an id variable, your transformation is easy:
reshape(mydf, direction = "wide",
idvar = "id", timevar="V1")
#   id V2.age V2.weight V2.height
# 1  1     20       185        72
# 4  2     87       109        60
# 7  3     15       109        58
Or:
# Your ids become the "rownames" with this approach
as.data.frame.matrix(xtabs(V2 ~ id + V1, mydf))
#   age height weight
# 1  20     72    185
# 2  87     60    109
# 3  15     58    109
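Neither result matches the layout asked for in the question exactly: reshape() prefixes the columns with "V2." and keeps the id column, while xtabs() orders the columns alphabetically. A minimal sketch of tidying the reshape() output follows; the data frame is rebuilt inline so the snippet runs on its own, and the name wide is just an illustrative choice:

```r
# Rebuild the long-format data from the example (one field per row)
mydf <- data.frame(V1 = rep(c("age", "weight", "height"), 3),
                   V2 = c(20, 185, 72, 87, 109, 60, 15, 109, 58),
                   stringsAsFactors = FALSE)
mydf$id <- ave(seq_along(mydf$V1), mydf$V1, FUN = seq_along)

wide <- reshape(mydf, direction = "wide", idvar = "id", timevar = "V1")

# Strip the "V2." prefix that reshape() adds to the value columns
names(wide) <- sub("^V2\\.", "", names(wide))

# Keep only the requested columns, in the requested order,
# and reset the row names to 1, 2, 3
wide <- wide[, c("age", "weight", "height")]
rownames(wide) <- NULL
```

Selecting the columns by name rather than by position means the same lines also work on the xtabs() result, which already lacks the prefix.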
To expand on @BlueMagister's answer, you can use scan with some options to read this directly into a list, then convert the list to a data frame:
tmp <- scan(text = "
age 20
weight 185
height 72
age 87
weight 109
height 60
age 15
weight 109
height 58", multi.line=TRUE,
what=list('',0,'',0,'',0),
blank.lines.skip=TRUE)
mydf <- as.data.frame( tmp[ c(FALSE,TRUE) ] )
names(mydf) <- sapply( tmp[ c(TRUE,FALSE) ], '[', 1 )
This assumes that the variables within a record are always in the same order.
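If that ordering cannot be relied on, one possible workaround is to derive the record number from the blank lines themselves and match values by field name. This is only a sketch under a few assumptions: every line is either blank or a "name value" pair, each field appears at most once per record, and the names lines, parts and result are made up for the illustration. The input is given inline here with the fields deliberately shuffled:

```r
# Illustrative input: records separated by blank lines, fields in any order
lines <- c("age 20", "weight 185", "height 72", "",
           "weight 109", "age 87", "height 60", "",
           "height 58", "age 15", "weight 109")

# A blank line ends a record, so a running count of blank lines
# identifies the record that each line belongs to
is_blank <- !nzchar(trimws(lines))
record <- cumsum(is_blank) + 1L

# Split the non-blank lines into a key column and a value column
parts <- read.table(text = lines[!is_blank], header = FALSE,
                    col.names = c("key", "value"),
                    stringsAsFactors = FALSE)
parts$record <- record[!is_blank]

# One row per record, one column per key; cells are matched by key name,
# so the order of fields inside a record does not matter
wide <- with(parts, tapply(value, list(record, key), FUN = identity))
result <- as.data.frame(wide)[, c("age", "weight", "height")]
```

For a real file, lines would come from readLines() on the file path instead of the inline vector. A field missing from a record shows up as NA in result rather than shifting the other values.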