I'm trying to add underscore before every capital letter followed by lower case. Here is the example:
cases <- c("XrefAcctnoAcctID", "NewXref1AcctID", "NewXref2AcctID", "ClientNo")
I have this:
[1] "XrefAcctnoAcctID" "NewXref1AcctID"  
[3] "NewXref2AcctID"   "ClientNo"     
And I want to have this:
"xref_acctno_acct_id" 
"new_xref1_acct_id"   
"new_xref2_acct_id"    
"client_no" 
I'm able to go this far:
> tolower(gsub("([a-z])([A-Z])", "\\1_\\2", cases))
[1] "xref_acctno_acct_id" "new_xref1acct_id"   
[3] "new_xref2acct_id"    "client_no" 
But "new_xref1acct_id" "new_xref2acct_id" does not reflect what I want.
We can use regex lookarounds to match the patterns that show a lowercase letter or a number followed by an upper case letter and replace it with _
tolower(gsub("(?<=[a-z0-9])(?=[A-Z])", "_", cases, perl = TRUE))
#[1] "xref_acctno_acct_id" "new_xref1_acct_id"   "new_xref2_acct_id"  
#[4] "client_no"  
Or without lookarounds, we can capture the lower case or numbers as a group followed by upper case letter as another group and replace it with backreference for that group separated by _
tolower(gsub("([a-z1-9])([A-Z])", "\\1_\\2", cases))
#[1] "xref_acctno_acct_id" "new_xref1_acct_id"   "new_xref2_acct_id"  
#[4] "client_no"       
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With