
Convert a properly formatted string to a data frame

Tags:

r

I have

x<-"1, A | 2, B | 10, C "

x is always formatted this way: | denotes a new row, the first value in each row is variable1, and the second value is variable2.

I would like to convert it to a data.frame:

  variable1 variable2
1         1         A
2         2         B
3        10         C

I haven't found any package that understands | as a row separator.

How can I convert it to a data.frame?

ECII asked Dec 25 '21 20:12



2 Answers

We may use read.table from base R to read the string into two columns after replacing each | with a newline (\n):

read.table(text = gsub("|", "\n", x, fixed = TRUE), sep=",", 
    header = FALSE, col.names = c("variable1", "variable2"), strip.white = TRUE )

-output

  variable1 variable2
1         1         A
2         2         B
3        10         C
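
If the same conversion is needed in several places, the read.table approach above can be wrapped in a small function. This is only a sketch: the name parse_pipe_string and its col_names argument are made up for illustration and are not part of the answer, and stringsAsFactors = FALSE is set explicitly so that the second column stays character on R versions before 4.0.

```r
# Hypothetical helper (not from the answer): wraps the read.table approach.
parse_pipe_string <- function(x, col_names = c("variable1", "variable2")) {
  # Turn each "|" row separator into a newline so read.table sees one row per line
  text <- gsub("|", "\n", x, fixed = TRUE)
  read.table(text = text, sep = ",", header = FALSE,
             col.names = col_names, strip.white = TRUE,
             stringsAsFactors = FALSE)
}

x <- "1, A | 2, B | 10, C "
df <- parse_pipe_string(x)
str(df)  # inspect the column types read.table inferred
```

Because read.table guesses column types, variable1 should come back as integer and variable2 as character for this input. Other inputs may be guessed differently (for example, a column containing only T and F is read as logical).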

Or use fread from data.table

library(data.table)
fread(gsub("|", "\n", x, fixed = TRUE), col.names = c("variable1", "variable2"))
   variable1 variable2
1:         1         A
2:         2         B
3:        10         C

Or, with the tidyverse, use separate_rows to split the string into rows and then separate to split each row into two columns

library(tidyr)
library(dplyr)
tibble(x = trimws(x)) %>% 
  separate_rows(x, sep = "\\s*\\|\\s*") %>%
  separate(x, into = c("variable1", "variable2"), sep=",\\s+", convert = TRUE)
# A tibble: 3 × 2
  variable1 variable2
      <int> <chr>    
1         1 A        
2         2 B        
3        10 C      
akrun answered Oct 20 '22 09:10



Here's a way using scan().

x <- "1, A | 2, B | 10, C "

do.call(rbind.data.frame,
        strsplit(scan(text = x, what = character(), sep = "|",
                      quiet = TRUE, strip.white = TRUE), ", ")) |>
  setNames(c("variable1", "variable2"))
#   variable1 variable2
# 1         1         A
# 2         2         B
# 3        10         C
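
One thing the printed output does not show: with the scan() approach both columns are character, because strsplit returns strings. If variable1 should be numeric, a possible follow-up is type.convert. The sketch below assumes R 4.0 or later (so rbind.data.frame does not create factors) and avoids the native pipe so that it also runs on R 4.0.x.

```r
x <- "1, A | 2, B | 10, C "

# Split on "|" into rows, then split each row on ", " into its two fields
rows <- strsplit(
  scan(text = x, what = character(), sep = "|", quiet = TRUE, strip.white = TRUE),
  ", ", fixed = TRUE
)

df <- setNames(do.call(rbind.data.frame, rows), c("variable1", "variable2"))

# Both columns are character at this point; let R re-guess the types.
# as.is = TRUE keeps text columns as character instead of factor.
df <- type.convert(df, as.is = TRUE)
str(df)
```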

Note: R version 4.1.2 (2021-11-01). The native pipe |> requires R 4.1.0 or later.

jay.sf answered Oct 20 '22 10:10
