I am trying to split a character vector into three different vectors, inside a data frame.
My data is something like:
> df <- data.frame(filename = c("Author1 (2010) Title of paper",
"Author2 et al (2009) Title of paper",
"Author3 & Author4 (2004) Title of paper"),
stringsAsFactors = FALSE)
I would like to split these three pieces of information (authors, year, title) into three separate columns, so that the result would be:
> df
                         filename            author year  title
1           Author1 (2010) Title1           Author1 2010 Title1
2     Author2 et al (2009) Title2     Author2 et al 2009 Title2
3 Author3 & Author4 (2004) Title3 Author3 & Author4 2004 Title3
I have used strsplit to split each filename into a vector of 3 elements:
df$temp <- strsplit(df$filename, " \\(|\\) ")
But now I can't find a way to put each element in a separate column. I can access a specific element like this:
> df$temp[[2]][1]
[1] "Author2 et al"
but I can't work out how to put each element into its own column:
> df$author <- df$temp[[]][1]
Error
You could try tstrsplit from the devel version of data.table:
library(data.table)  # v1.9.5+
setDT(df)[, c('author', 'year', 'title') := tstrsplit(filename, ' \\(|\\) ')]
df
# filename author year
#1: Author1 (2010) Title of paper Author1 2010
#2: Author2 et al (2009) Title of paper Author2 et al 2009
#3: Author3 & Author4 (2004) Title of paper Author3 & Author4 2004
# title
#1: Title of paper
#2: Title of paper
#3: Title of paper
Edit: Included OP's split pattern to remove the white spaces.
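If you would rather stay in base R, here is a minimal sketch along the same lines. It assumes every filename splits into exactly three pieces with the OP's pattern; if a title itself contains " (" or ") ", the pieces will not line up and the rbind step will recycle values. The name m is just an illustrative temporary, and the conversion of year to integer is optional.

```r
df <- data.frame(filename = c("Author1 (2010) Title of paper",
                              "Author2 et al (2009) Title of paper",
                              "Author3 & Author4 (2004) Title of paper"),
                 stringsAsFactors = FALSE)

# Split each filename on " (" or ") "; the result is a list of character vectors
parts <- strsplit(df$filename, " \\(|\\) ")

# Stack the vectors row-wise into a character matrix (one row per filename).
# Assumption: every element of 'parts' has exactly 3 pieces.
m <- do.call(rbind, parts)

# Copy each matrix column into its own data frame column
df$author <- m[, 1]
df$year   <- as.integer(m[, 2])  # optional: store the year as an integer
df$title  <- m[, 3]
```

The same columns can also be pulled out of the list one position at a time, e.g. sapply(parts, `[`, 1) for the authors, which is the vectorised form of df$temp[[2]][1] from the question.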