I have a data set wherein a column looks like this:
ABC|DEF|GHI, ABCD|EFG|HIJK, ABCDE|FGHI|JKL, DEF|GHIJ|KLM, GHI|JKLM|NO|PQRS, BCDE|FGHI|JKL
.... and so on
I need to extract the characters that appear before the first | symbol.
In Excel, we would use a combination of MID and SEARCH, or LEFT and SEARCH. R has substr().
The syntax is: substr(x, <start>, <stop>)
In my case, start will always be 1. For stop, we need to find the position of the first |. How can we achieve this? Are there alternative ways to do this?
In JavaScript, use the substring() method to get the substring before a specific character, e.g. const before = str.substring(0, str.indexOf('_')); The substring() method returns a new string containing the part of the string before the specified character.
JavaScript's substr() method extracts a part of a string. It begins at a specified position and returns a specified number of characters, without changing the original string. To extract characters from the end of the string, use a negative start position.
Python Substring Before Character: You can extract a substring from a string before a specific character using the rpartition() method. rpartition() splits the string at the last occurrence of the delimiter and returns a tuple of three elements: the part before the delimiter, the delimiter itself, and the part after it. To get the text before the first occurrence instead, use partition().
We can use sub:
sub("\\|.*", "", str1)
#[1] "ABC"
Or with strsplit:
strsplit(str1, "[|]")[[1]][1]
#[1] "ABC"
If we use the data from @hrbrmstr:
sub("\\|.*", "", df$V1)
#[1] "ABC" "ABCD" "ABCDE" "DEF" "GHI" "BCDE"
These are all base R methods. No external packages used.
str1 <- "ABC|DEF|GHI ABCD|EFG|HIJK ABCDE|FGHI|JKL DEF|GHIJ|KLM GHI|JKLM|NO|PQRS BCDE|FGHI|JKL"
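The question asks specifically about substr() with a searched stop position, much like LEFT with SEARCH in Excel. A minimal base R sketch of that idea follows, assuming the values are held in a character vector; the names x, pos, end_pos and before are hypothetical and not from the original posts.

```r
# Sample values copied from the question, one element per value
x <- c("ABC|DEF|GHI", "ABCD|EFG|HIJK", "ABCDE|FGHI|JKL",
       "DEF|GHIJ|KLM", "GHI|JKLM|NO|PQRS", "BCDE|FGHI|JKL")

# Position of the first "|" in each element.
# fixed = TRUE treats "|" as a literal character, not as regex alternation.
pos <- as.integer(regexpr("|", x, fixed = TRUE))

# regexpr() returns -1 when there is no match; in that case keep the whole string
end_pos <- ifelse(pos == -1L, nchar(x), pos - 1L)

# start is always 1, stop is the character just before the first "|"
before <- substr(x, 1, end_pos)
before
#[1] "ABC"   "ABCD"  "ABCDE" "DEF"   "GHI"   "BCDE"
```

Compared with sub(), this keeps the start and stop logic explicit, which may be easier to follow when coming from Excel.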
Another option is the word function from the stringr package:
library(stringr)
word(df1$V1, 1, sep = "\\|")
Data
df1 <- read.table(text = "ABC|DEF|GHI
ABCD|EFG|HIJK
ABCDE|FGHI|JKL
DEF|GHIJ|KLM
GHI|JKLM|NO|PQRS
BCDE|FGHI|JKL", stringsAsFactors = FALSE)
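If installing stringr is not an option, the strsplit idea from the earlier answer can be applied to a whole column in base R. This is only a sketch, assuming the values sit in a character column V1 of a data frame; the data frame is rebuilt here so the snippet runs on its own, and the column name first is hypothetical.

```r
# Rebuild the sample data as a one-column data frame (assumed layout)
df1 <- data.frame(
  V1 = c("ABC|DEF|GHI", "ABCD|EFG|HIJK", "ABCDE|FGHI|JKL",
         "DEF|GHIJ|KLM", "GHI|JKLM|NO|PQRS", "BCDE|FGHI|JKL"),
  stringsAsFactors = FALSE
)

# Split every value on the literal "|" (fixed = TRUE avoids regex escaping)
parts <- strsplit(df1$V1, "|", fixed = TRUE)

# Take the first piece of each split; vapply() guarantees a character vector
df1$first <- vapply(parts, function(p) p[1], character(1))
df1$first
#[1] "ABC"   "ABCD"  "ABCDE" "DEF"   "GHI"   "BCDE"
```

Values that contain no | come back unchanged, since strsplit() then returns the whole string as a single piece.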