I would like to delete the last character of a variable.
I was wondering if it is possible to select the position with gsub
and delete the character at this particular position.
In this example, I want to delete the last digit, after the E, for my 4 variables.
variables = c('B10243E1', 'B10243E2', 'B10243E3', 'B10243E4')
gsub(pattern = '[[:xdigit:]]{8}.', replacement = '', x = variables)
I thought we could use the command {} in order to select a specific position.
You can do it by capturing all the characters but the last:
variables = c('B10243E1', 'B10243E2', 'B10243E3', 'B10243E4')
gsub('^(.*).$', '\\1', variables)
Explanation:
^ - Start of the string
(.*) - All characters but a newline, up to the last character (captured in group 1)
.$ - The last character (matched with .) before the end of string ($)
Thus, this regex is good to use if you plan to remove the final character, and the string does not contain a newline.
Output:
[1] "B10243E" "B10243E" "B10243E" "B10243E"
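If a regex is not strictly needed, the same result can be sketched with substr() and nchar(), which also copes with strings of different lengths. The vector `mixed` below is made up for illustration and does not come from the question.

```r
variables <- c('B10243E1', 'B10243E2', 'B10243E3', 'B10243E4')

# Keep everything from position 1 up to the second-to-last character
trimmed <- substr(variables, 1, nchar(variables) - 1)
print(trimmed)

# Hypothetical input with strings of different lengths
mixed <- c('B10243E1', 'AB9', 'x')
trimmed_mixed <- substr(mixed, 1, nchar(mixed) - 1)
print(trimmed_mixed)
```

A one-character string becomes an empty string here, which may or may not be what you want.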
To only remove the 8th character (here is a sample where I added T at the end of each item):
variables = c('B10247E1T', 'B10243E2T', 'B10243E3T', 'B10243E4T')
gsub('^(.{7}).', '\\1', variables)
Output of the sample program (note the ET at the end of each item: only the digit was removed):
[1] "B10247ET" "B10243ET" "B10243ET" "B10243ET"
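The same pattern can be generalized to any position. The following is a minimal sketch, assuming a hypothetical helper named `remove_char_at` (not part of base R) that builds the regex with sprintf():

```r
# Hypothetical helper: remove the character at position `pos` from each string
remove_char_at <- function(x, pos) {
  stopifnot(pos >= 1)
  # For pos = 8 this builds '^(.{7}).': capture the first pos - 1 characters,
  # match one more character, and keep only the captured part
  pattern <- sprintf('^(.{%d}).', as.integer(pos) - 1L)
  sub(pattern, '\\1', x)
}

variables <- c('B10247E1T', 'B10243E2T', 'B10243E3T', 'B10243E4T')
without_eighth <- remove_char_at(variables, 8)
print(without_eighth)
```

Strings shorter than `pos` do not match the pattern and are returned unchanged.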
Try any of these. The first removes the last character, the second replaces E and anything after it with E, the third removes the 8th character (which leaves the first 7 characters, assuming there are 8), the fourth, fifth and sixth each return the first 7 characters, and the last strips trailing digits from its example string. All are vectorized, i.e. variables may be a vector of character strings as in the question.
sub(".$", "", variables)
sub("E.*", "E", variables)
sub("^(.{7}).", "\\1", variables)
sub("^(.{7}).*", "\\1", variables)
substr(variables, 1, 7)
substring(variables, 1, 7)
trimws("abc333", "right", "\\d") # strips trailing digits; the whitespace argument requires R 3.6.0 or later
The regular expression in the third solution (visualized in a Debuggex demo):
^(.{7}).
and the regular expression in the fourth solution (also visualized in a Debuggex demo):
^(.{7}).*
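As a quick sanity check, the first six alternatives can be compared against each other on the question's data. This is only a sketch: the list names and the 9-character string `longer` are made up for illustration.

```r
variables <- c('B10243E1', 'B10243E2', 'B10243E3', 'B10243E4')

results <- list(
  last_char   = sub(".$", "", variables),
  after_E     = sub("E.*", "E", variables),
  eighth_char = sub("^(.{7}).", "\\1", variables),
  first_seven = sub("^(.{7}).*", "\\1", variables),
  substr      = substr(variables, 1, 7),
  substring   = substring(variables, 1, 7)
)

# TRUE for each approach whose result is identical to the first one
all_same <- vapply(results, identical, logical(1), results[[1]])
print(all_same)

# The approaches diverge once the input is not exactly 8 characters long
longer <- 'B10243E1T'             # hypothetical 9-character input
sub(".$", "", longer)             # drops the trailing T
sub("^(.{7}).", "\\1", longer)    # drops only the 8th character
substr(longer, 1, 7)              # keeps the first 7 characters
```

Which one to pick depends on whether you mean "the last character", "the 8th character" or "everything after the 7th character".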
If you always want to remove everything after E, you can match E together with everything after it and replace the whole match with E:
sub("E(.*)", 'E', variables)
## [1] "B10243E" "B10243E" "B10243E" "B10243E"
Alternatively, you can count 7 characters using a positive lookbehind and remove the character that follows them:
sub("(?<=.{7})(.)", "", variables, perl = TRUE)
## [1] "B10243E" "B10243E" "B10243E" "B10243E"
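Note that with sub() the lookbehind pattern removes only one character. A small sketch of the difference between sub() and gsub(), using a made-up 9-character string (not from the question):

```r
x <- 'B10243E1T'  # hypothetical 9-character input

# sub() stops after the first match: only the 8th character is removed
first_only <- sub("(?<=.{7}).", "", x, perl = TRUE)
print(first_only)

# gsub() repeats the match: every character preceded by at least
# 7 characters is removed, i.e. everything after the 7th character
all_after <- gsub("(?<=.{7}).", "", x, perl = TRUE)
print(all_after)
```

For the 8-character strings in the question both calls give the same result.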