I have several strings that were concatenated together, and I would like to split them apart again. I am dealing with things such as
name="o-n-Butylhydroxylamine1-MethylpropylhydroxylamineAmino-2-butanol"
which in this case should be split into
"o-n-Butylhydroxylamine", "1-Methylpropylhydroxylamine"
and "Amino-2-butanol"
Any thoughts on how I could use strsplit and/or gsub with a regular expression to achieve this?
The rule I would like to use is to split a word wherever a lowercase letter is followed by a number, an opening bracket ("("), or a capital letter.
In JavaScript, to split a string on capital letters, call the split() method with the regular expression /(?=[A-Z])/. The regular expression uses a positive lookahead assertion to split the string before each capital letter, and split() returns an array of the substrings.
Traverse the string character by character from start to end. Check the ASCII value of each character for the following conditions: If the ASCII value lies in the range of [65, 90], then it is an uppercase letter. If the ASCII value lies in the range of [97, 122], then it is a lowercase letter.
How would you check if each word in a string begins with a capital letter? The istitle() function checks if each word is capitalized.
In Python, use the re.findall() method to split a string on uppercase letters, e.g. re.findall('[a-zA-Z][^A-Z]*', my_str).
You could use look-around assertions (a positive lookbehind plus a positive lookahead) to find, and then split at, the inter-character positions that are preceded by a lowercase letter and followed by an uppercase letter, a digit, or a "(".
name <- "o-n-Butylhydroxylamine1-MethylpropylhydroxylamineAmino-2-butanol"
pat <- "(?<=[[:lower:]])(?=[[:upper:][:digit:](])"
strsplit(name, pat, perl=TRUE)
# [[1]]
# [1] "o-n-Butylhydroxylamine" "1-Methylpropylhydroxylamine"
# [3] "Amino-2-butanol"
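Since the question also mentions gsub, here is a minimal sketch of that route, assuming a separator character that never occurs in the names ("|" is used here purely as an illustration). It marks each boundary with the separator and then splits on it as a fixed string:

```r
name <- "o-n-Butylhydroxylamine1-MethylpropylhydroxylamineAmino-2-butanol"

# Assumption: "|" never appears in the chemical names being split.
sep <- "|"

# Capture the lowercase letter and the character that follows it,
# then put the separator between the two captured groups.
marked <- gsub("([[:lower:]])([[:upper:][:digit:](])",
               paste0("\\1", sep, "\\2"), name)

# fixed = TRUE treats the separator literally rather than as a regex.
parts <- strsplit(marked, sep, fixed = TRUE)[[1]]
print(parts)
```

This variant needs no look-around, so it does not depend on perl=TRUE, but it only works if the chosen separator is guaranteed to be absent from the input.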
strsplit(name, "(?<=([a-z]))(?=[A-Z]|[0-9]|\\()", perl=TRUE)
# [[1]]
# [1] "o-n-Butylhydroxylamine" "1-Methylpropylhydroxylamine" "Amino-2-butanol"
Remember that the return value is a list, so use [[1]] if appropriate.
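To illustrate the list return value, here is a small sketch that applies the same pattern to a character vector. The second input string is a made-up example, not from the question:

```r
# First element is from the question; the second is a hypothetical extra input.
names_in <- c(
  "o-n-Butylhydroxylamine1-MethylpropylhydroxylamineAmino-2-butanol",
  "ethanolMethanol"
)
pat <- "(?<=[[:lower:]])(?=[[:upper:][:digit:](])"

# strsplit returns a list with one character vector per input string.
parts <- strsplit(names_in, pat, perl = TRUE)

first <- parts[[1]]       # pieces of the first string only
all_parts <- unlist(parts)  # every piece, flattened into one vector
print(first)
print(all_parts)
```

Use [[i]] when you need the pieces of one particular input, and unlist() when it does not matter which input each piece came from.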