How would I use substring to only use the first 3 digits of the postal code in the data sheet?
YEAR PERSON POSTALCODE STORE_ID
2012 245345 M2H 2I4 20001319
2012 234324 L6N 3R5 20001319
2012 556464 L6N 4T5 20001319
This is a piece of code I tried, however my data sheet appeared with 0 objects after I added the substring part of the code (I'm guessing I made an extremely dumb mistake):
combined <- merge(df1, df2, by.y="PERSON")
store1 <- combined[combined$STORE_ID == 20001319 && substr(combined$POSTALCODE, 1, 3), ]
In R, the substring of a vector or data frame column can be extracted with substr() or substring(). Both are vectorized, so they return one substring for each element of the column.
To extract the columns of an R data frame whose names contain a particular string, apply grepl() to the column names and then subset the data frame with single square brackets.
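As a minimal sketch of that column-name approach, assuming a one-row data frame that reuses the column names from the question (the object names df and code_cols are made up for illustration):

```r
# Hypothetical one-row data frame using the question's column names.
df <- data.frame(
  YEAR = 2012,
  PERSON = 245345,
  POSTALCODE = "M2H 2I4",
  STORE_ID = 20001319,
  stringsAsFactors = FALSE
)

# grepl() returns one TRUE/FALSE per column name;
# drop = FALSE keeps the result a data frame even if only one column matches.
code_cols <- df[, grepl("CODE", names(df)), drop = FALSE]
names(code_cols)
```

Note that this selects columns by name. It does not look at the values inside POSTALCODE, which is what the question needs substr() for.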
substr(x, start, stop) returns the part of each string in x from position start to position stop, both inclusive, with positions counted from 1. For the data in the question:
substr(combined$POSTALCODE, 1, 3)
gives you
# [1] "M2H" "L6N" "L6N"
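The original attempt fails for two reasons: && is meant for single TRUE/FALSE values rather than whole columns, and the substr() result is never compared to anything, so it is not a logical condition. Use the element-wise & and compare the prefix to a value. One possible selection could be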
combined[combined$STORE_ID == 20001319 & substr(combined$POSTALCODE, 1, 3) == "M2H", ]
which gives you the subset
# YEAR PERSON POSTALCODE STORE_ID
# 1 2012 245345 M2H 2I4 20001319
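If the three-character prefix is needed more than once, it may be easier to store it in its own column first. The following is a sketch only: it rebuilds the three sample rows from the question by hand instead of merging df1 and df2, and the column name PREFIX and the choice of "L6N" are assumptions made for illustration.

```r
# Rebuild the sample rows shown in the question (values copied from the post).
combined <- data.frame(
  YEAR = c(2012, 2012, 2012),
  PERSON = c(245345, 234324, 556464),
  POSTALCODE = c("M2H 2I4", "L6N 3R5", "L6N 4T5"),
  STORE_ID = c(20001319, 20001319, 20001319),
  stringsAsFactors = FALSE
)

# Keep the first three characters of each postal code in a new column.
# PREFIX is a hypothetical column name, not one from the original data.
combined$PREFIX <- substr(combined$POSTALCODE, 1, 3)

# Element-wise & keeps only the rows where both conditions are TRUE.
store1 <- combined[combined$STORE_ID == 20001319 & combined$PREFIX == "L6N", ]
store1
```

With the sample data this keeps the two rows whose postal code starts with L6N.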