I am trying to use str_split to split the following observations into a specific format.
"00010943900008" "00010946803119" "00010946803219" "00010946803219" "00010946803219" "00010948700007"
I am trying to split each one into separate columns, so that the first observation looks like this:
Column x = 00
Column y = 01
Column z = 09439
Column w = 00008
Column x will always be the first 2 digits of the observation, column y the next 2 digits, column z the next 5 digits, and column w the final 5 digits.
Data
string <- c("00010943900008", "00010946803119", "00010946803219", "00010946803219",
"00010946803219", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016"
)
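To split a column into multiple columns in R, one option is the str_split_fixed() function from the stringr package, which splits a string into a fixed number of pieces. It splits on a pattern (a delimiter), though, so it is not a direct fit for values like these, where the fields are defined by character position.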
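Because the fields here are defined by position rather than by a delimiter, one stringr-based option is str_match() with one capture group per field. The following is only a minimal sketch: it uses just two of the sample values, and the regular expression assumes every value is exactly 14 characters long (anything else comes back as NA).

```r
library(stringr)

# Two of the sample values from the question
string <- c("00010943900008", "00010946803119")

# One capture group per field, with widths 2, 2, 5 and 5
parts <- str_match(string, "^(.{2})(.{2})(.{5})(.{5})$")

# Column 1 of the matrix holds the full match, so drop it and name the rest
result <- as.data.frame(parts[, -1, drop = FALSE], stringsAsFactors = FALSE)
names(result) <- c("x", "y", "z", "w")
result
```

The columns stay as character vectors, which preserves the leading zeros.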
You can either concatenate your data with \n as a separator or write it to a file, then use readr::read_fwf or read.fwf (which needs a file or connection rather than a literal string) to import it as fixed-width format. Here's the readr::read_fwf version without writing to disk:
library(readr)
# Join the values with newlines so read_fwf treats the string as literal data
result = read_fwf(paste(string, collapse = "\n"),
                  col_positions = fwf_widths(c(2, 2, 5, 5), col_names = c("x", "y", "z", "w")))
head(result)
# # A tibble: 6 x 4
# x y z w
# <chr> <chr> <chr> <chr>
# 1 00 01 09439 00008
# 2 00 01 09468 03119
# 3 00 01 09468 03219
# 4 00 01 09468 03219
# 5 00 01 09468 03219
# 6 00 01 09487 00007
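If you would rather not depend on a package, the same split can be sketched in base R with substring(), deriving the start and end positions from the field widths. This is an alternative under the same assumption (fixed widths of 2, 2, 5 and 5), shown on two of the sample values; the helper names widths, starts and ends are purely illustrative.

```r
# Two of the sample values from the question
string <- c("00010943900008", "00010946803119")

# Field widths, and the start/end positions they imply
widths <- c(2, 2, 5, 5)
ends   <- cumsum(widths)       # 2, 4, 9, 14
starts <- ends - widths + 1    # 1, 3, 5, 10

# Extract each field from every string, one column per field
cols <- Map(function(s, e) substring(string, s, e), starts, ends)
result <- as.data.frame(setNames(cols, c("x", "y", "z", "w")),
                        stringsAsFactors = FALSE)
result
```

Changing the widths vector is all it takes to adapt this to a different layout.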