I have a string whose structure and length can vary, for example:
Input:
X <- ("A=12&B=15&C=15")
Y <- ("A=12&B=15&C=15&D=32&E=53")
I want to convert each string to a data frame.
Output Expected:
Dataframe X
A B C
12 15 15
and Dataframe Y
A B C D E
12 15 15 32 53
What I tried was this:
X <- as.data.frame(strsplit(X, split="&"))
But this didn't work for me: it created only one column, and the column name was mangled.
P.S.: I cannot hard-code the column names because they can vary. At any given time a string will contain only one row.
One option is to extract the numeric part of the string and read it with read.table. The pattern [^0-9]+ matches one or more characters that are not digits; the first gsub replaces each such run with a space, and the result is read with read.table. The column names are supplied through the col.names argument, using the values obtained by removing all characters that are not upper-case letters (the second gsub).
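f1 <- function(str1) {
  # Values: replace every run of non-digits with a space, then read the row
  read.table(text = gsub("[^0-9]+", " ", str1),
             # Names: keep only the upper-case letters, separated by spaces
             col.names = scan(text = trimws(gsub("[^A-Z]+", " ", str1)),
                              what = "", sep = " ", quiet = TRUE))
}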
f1(X)
#   A  B  C
#1 12 15 15
f1(Y)
#   A  B  C  D  E
#1 12 15 15 32 53
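The regex approach above relies on the names being upper-case letters and the values being plain integers. If that may not hold, a possible alternative is to split on the delimiters themselves. The following is only a sketch: parse_pairs is a hypothetical helper name (not part of the original answer), and it assumes every piece has the form key=value with a numeric value.

```r
X <- "A=12&B=15&C=15"
Y <- "A=12&B=15&C=15&D=32&E=53"

# Hypothetical helper: assumes "key=value" pieces joined by "&",
# with values that as.numeric() can parse.
parse_pairs <- function(str1) {
  # Split into "key=value" pieces, then split each piece at "="
  pieces <- strsplit(str1, "&", fixed = TRUE)[[1]]
  pairs  <- strsplit(pieces, "=", fixed = TRUE)
  keys   <- vapply(pairs, `[`, character(1), 1)
  vals   <- vapply(pairs, `[`, character(1), 2)
  # A named list with one element per key becomes a one-row data frame
  as.data.frame(as.list(setNames(as.numeric(vals), keys)))
}

parse_pairs(X)
parse_pairs(Y)
```

Because the names come from the text before each "=", they do not need to be single upper-case letters. Note that as.data.frame may adjust names that are not syntactically valid R names.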