
R split array into Data frame

Tags: r

I'm very new to R and struggling to know exactly what to ask. I found a similar question, "How to split a character vector into data frame?", but it deals with fixed-length input, and I've been unable to adapt it to my problem.

I've got some data in a character vector in R:

TEST <- c("Value01:100|Value02:200|Value03:300|", "Value04:1|Value05:2|",
          "StillAValueButNamesAreNotConsistent:12345.6789|",
          "AlsoNotAllLinesAreTheSameLength:1|")

The data is stored as Variable:Value pairs separated by |, and I'd like to split it into a data frame like this:

Variable                              Value
Value01                               100
Value02                               200
Value03                               300
Value04                               1
Value05                               2
StillAValueButNamesAreNotConsistent   12345.6789
AlsoNotAllLinesAreTheSameLength       1

The variable name is a string and the value will always be a number.

Any help would be great!

Thanks

LPD asked Jul 28 '18 16:07




1 Answer

One option is a tidyr-based solution. Convert the vector TEST to a data.frame and remove the trailing | from each row, since it carries no information.

Then use tidyr::separate_rows to expand each row into one row per |-separated pair, and tidyr::separate to split each pair at the : into two columns.

library(dplyr)
library(tidyr)

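data.frame(TEST) %>%
  # drop the trailing "|" from each element
  mutate(TEST = gsub("\\|$", "", TEST)) %>%
  # expand to one row per "Variable:Value" pair
  separate_rows(TEST, sep = "[|]") %>%
  # split each pair into two columns at ":"
  separate(TEST, c("Variable", "Value"), ":")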

#                              Variable      Value
# 1                             Value01        100
# 2                             Value02        200
# 3                             Value03        300
# 4                             Value04          1
# 5                             Value05          2
# 6 StillAValueButNamesAreNotConsistent 12345.6789
# 7     AlsoNotAllLinesAreTheSameLength          1
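
Since the question says the value is always a number, it may be worth noting that separate() as called above leaves Value as a character column (passing convert = TRUE to separate() is one way to change that). As an alternative, here is a minimal base R sketch of the same idea with no package dependencies. The names pairs, parts, and result are only illustrative, and the sketch assumes every pair contains exactly one ":".

```r
TEST <- c("Value01:100|Value02:200|Value03:300|", "Value04:1|Value05:2|",
          "StillAValueButNamesAreNotConsistent:12345.6789|",
          "AlsoNotAllLinesAreTheSameLength:1|")

# Split every element on the literal "|" and flatten into one vector of pairs.
pairs <- unlist(strsplit(TEST, "|", fixed = TRUE))

# Guard against empty pieces (for example from a doubled "||").
pairs <- pairs[nzchar(pairs)]

# Split each "Variable:Value" pair at the ":" (assumes exactly one ":" per pair).
parts <- strsplit(pairs, ":", fixed = TRUE)

# Assemble the data frame, converting the value to numeric.
result <- data.frame(
  Variable = vapply(parts, `[`, character(1), 1),
  Value    = as.numeric(vapply(parts, `[`, character(1), 2)),
  stringsAsFactors = FALSE
)

print(result)
```

Using fixed = TRUE treats "|" as a literal character rather than the regular-expression alternation operator, which avoids the escaping needed in the gsub() and separate_rows() calls.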

MKR answered Sep 29 '22 01:09