I have
x <- "1, A | 2, B | 10, C "
x is always formatted this way: | starts a new row, the first value in each row is variable1, and the second value is variable2.
I would like to convert it to a data.frame:
  variable1 variable2
1         1         A
2         2         B
3        10         C
I haven't found any package that treats | as a row separator.
How can I convert x to a data.frame?
We may use read.table from base R to read the string into two columns after replacing each | with \n:
read.table(text = gsub("|", "\n", x, fixed = TRUE), sep = ",",
           header = FALSE, col.names = c("variable1", "variable2"),
           strip.white = TRUE)
Output:
  variable1 variable2
1         1         A
2         2         B
3        10         C
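The fixed = TRUE argument matters in the call above because | is the alternation operator in a regular expression, so an unescaped "|" pattern would not match the literal character. A minimal sketch of two ways to match a literal | (the short string s is made up for illustration and is not from the question):

```r
s <- "a|b"  # hypothetical input, for illustration only

# Option 1: treat the pattern as a literal string
literal <- gsub("|", "\n", s, fixed = TRUE)

# Option 2: keep regex matching, but escape the metacharacter
escaped <- gsub("\\|", "\n", s)

# Both give the same result: "a\nb"
identical(literal, escaped)
```

Either form should work with read.table(text = ...); fixed = TRUE avoids the need for escaping.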
Or use fread from data.table:
library(data.table)
fread(gsub("|", "\n", x, fixed = TRUE), col.names = c("variable1", "variable2"))
   variable1 variable2
1:         1         A
2:         2         B
3:        10         C
Or, using the tidyverse, split the string into rows with separate_rows and then split each row into two columns with separate:
library(tidyr)
library(dplyr)
tibble(x = trimws(x)) %>%
  separate_rows(x, sep = "\\s*\\|\\s*") %>%
  separate(x, into = c("variable1", "variable2"), sep = ",\\s+", convert = TRUE)
# A tibble: 3 × 2
  variable1 variable2
      <int> <chr>
1         1 A
2         2 B
3        10 C
Here's a way using scan().
x <- "1, A | 2, B | 10, C "
do.call(rbind.data.frame,
        strsplit(scan(text = x, what = "A", sep = "|", quiet = TRUE, strip.white = TRUE), ", ")) |>
  setNames(c("variable1", "variable2"))
#   variable1 variable2
# 1         1         A
# 2         2         B
# 3        10         C
Note: tested with R version 4.1.2 (2021-11-01). The native pipe |> requires R 4.1.0 or later.
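Approaches that go through strsplit() produce character vectors, so variable1 may come out as character rather than integer. As a hedged sketch in base R (the intermediate names rows, parts, and df are illustrative and not from the answers above), type.convert() can restore the column types:

```r
x <- "1, A | 2, B | 10, C "

# Split into rows on "|", dropping surrounding whitespace
rows <- strsplit(trimws(x), "\\s*\\|\\s*")[[1]]

# Split each row into its two fields on the comma
parts <- strsplit(rows, ",\\s*")

# Assemble the columns; both are character at this point
df <- data.frame(
  variable1 = vapply(parts, `[`, character(1), 1),
  variable2 = vapply(parts, `[`, character(1), 2),
  stringsAsFactors = FALSE
)

# Convert each column to its most specific type (variable1 becomes integer)
df <- type.convert(df, as.is = TRUE)
str(df)
```

as.is = TRUE keeps variable2 as character instead of turning it into a factor.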