I'm combining several tables that are almost identical, except that some names are accented in one table and unaccented in another. For instance, "André" sometimes appears as "Andre", "Flávio" as "Flavio", and so on. I need to treat all such variants as equal, but unique() treats them as different. I thought about converting every accented character to its unaccented form and then calling unique(), but maybe there is another, faster option.
Later I need to make the same accent-insensitive comparison using ==, so I'm thinking about removing all accents from a copy of each table and doing the comparison on the copies. Please tell me if there's a different, better approach.
The approach of removing accents prior to comparison seems appropriate for your purposes. Note that such a facility exists in iconv with the TRANSLIT flag:
iconv(c("André","Flávio"),to='ASCII//TRANSLIT')
#> [1] "Andre" "Flavio"
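To cover both parts of the question (unique() and ==), here is a minimal sketch built on the same iconv call. The helper name strip_accents and the sample vectors names_a and names_b are made up for illustration. One caveat: what //TRANSLIT produces depends on the platform's iconv implementation and on the locale, so check the result on your own machine before relying on it.

```r
# Hypothetical helper (not part of base R): strip accents with iconv.
# enc2utf8() plus from = "UTF-8" makes the input encoding explicit.
# The transliterated result is platform- and locale-dependent.
strip_accents <- function(x) {
  iconv(enc2utf8(x), from = "UTF-8", to = "ASCII//TRANSLIT")
}

# Made-up sample data standing in for a column from each table.
names_a <- c("André", "Flávio", "Andre")
names_b <- c("Andre", "Flavio", "André")

# Build accent-free keys once, then reuse them.
key_a <- strip_accents(names_a)
key_b <- strip_accents(names_b)

# Accent-insensitive unique(): drop rows whose key was already seen,
# keeping the first original spelling of each name.
unique_a <- names_a[!duplicated(key_a)]

# Accent-insensitive element-wise comparison with ==.
same <- key_a == key_b

print(key_a)
print(unique_a)
print(same)
```

Comparing on the keys while keeping the original vectors means the accented spellings are not lost. If iconv turns out to behave inconsistently across your machines, stringi::stri_trans_general(x, "Latin-ASCII") from the stringi package is another option that may be worth trying.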