I have a file of baby names that I am reading in, and I am trying to get the last character of each name. For example, the file looks like:
Name Sex
Anna F
Michael M
David M
Sarah F
I read this in using
sourcenames = read.csv("babynames.txt", header=F, sep=",")
I ultimately want to end up with my result looking like..
Name Last Initial Sex
Michael l M
Sarah h F
I've managed to split the name into separate characters..
sourceout = strsplit(as.character(sourcenames$Name),'')
Where I'm stuck now is how to get the last letter, so in the case of Michael, how to get 'l'. I thought tail() might work, but it's returning the last few records, not the last character in each Name element.
Any help or advice is greatly appreciated.
Thanks :)
To get the last n characters from a string, we can use the stri_sub() function from the stringi package in R.
To access individual characters in an R string, use the substr function: with str = 'string', substr(str, 1, 1) evaluates to 's'. Likewise, length does not give the number of characters in a string (it returns the number of elements in the vector); use nchar instead.
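As a rough illustration of combining substr and nchar, here is a minimal base R sketch for pulling the last n characters of each string. The helper name last_chars and the baby_names vector are hypothetical, introduced only for this example.

```r
# Hypothetical helper (not from the original post):
# returns the last n characters of each element of x.
last_chars <- function(x, n = 1) {
  x <- as.character(x)            # also handles factor input
  len <- nchar(x)                 # number of characters per element
  # start position is clamped to 1 so short strings are returned whole
  substr(x, pmax(len - n + 1, 1), len)
}

baby_names <- c("Anna", "Michael", "David", "Sarah")
last_one   <- last_chars(baby_names)     # last character of each name
last_three <- last_chars(baby_names, 3)  # last three characters of each name
print(last_one)
print(last_three)
```

Because substr is vectorized over its start and stop arguments, no explicit loop or sapply call is needed here.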
To remove the last n characters of a string, we can use the built-in substring() function in R.
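One way this could look in base R is sketched below. The helper name drop_last is hypothetical and only serves the example; it keeps everything from position 1 up to nchar(x) - n.

```r
# Hypothetical helper (not from the original post):
# removes the last n characters from each element of x.
drop_last <- function(x, n = 1) {
  x <- as.character(x)
  # the end position is clamped at 0, which yields "" when n >= nchar(x)
  substring(x, 1, pmax(nchar(x) - n, 0))
}

trimmed_one <- drop_last("Michael")              # drops the final "l"
trimmed_two <- drop_last(c("Anna", "Sarah"), 2)  # drops two characters each
print(trimmed_one)
print(trimmed_two)
```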
For your strsplit method to work, you can use tail with sapply:
df$LastInit <- sapply(strsplit(as.character(df$Name), ""), tail, 1)
df
# Name Sex LastInit
# 1 Anna F a
# 2 Michael M l
# 3 David M d
# 4 Sarah F h
Alternatively, you can use substring
with(df, substring(Name, nchar(Name)))
# [1] "a" "l" "d" "h"
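Putting the pieces together, the following is a sketch of the whole round trip, under the assumption that the file is whitespace-separated with a header row, as the sample in the question suggests. With header=F and sep="," the data would not get a Name column, so read.table with header = TRUE is used here instead. The inline names_txt string stands in for babynames.txt and is only for illustration.

```r
# Stand-in for the contents of babynames.txt (assumed whitespace-separated).
names_txt <- "Name Sex
Anna F
Michael M
David M
Sarah F"

# colClasses = "character" keeps Name as character and stops a column
# made up only of "F" values from being read as logical FALSE.
df <- read.table(text = names_txt, header = TRUE, colClasses = "character")

# Last character of each name: start at position nchar(Name).
df$LastInit <- substring(df$Name, nchar(df$Name))

# Reorder columns to match the layout asked for in the question.
df <- df[, c("Name", "LastInit", "Sex")]
print(df)
```

For a real file, replacing text = names_txt with the file path should behave the same way, provided the file really has this layout.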