How to trim white spaces when trimws is not working?

2 Answers

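The character with code 160 (Unicode U+00A0) is called a "non-breaking space." Strictly speaking it is not an ASCII character, since ASCII only covers codes 0 to 127; it comes from Latin-1 and Unicode. One can read about it in Wikipedia: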

https://en.wikipedia.org/wiki/Non-breaking_space

By default, the trimws() function does not include it among the characters it removes:

x <- intToUtf8(c(160,49,49,57,57,46,48,48))
x
#[1] " 1199.00"

trimws(x)
#[1] " 1199.00"
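Because the non-breaking space prints like an ordinary space, it helps to confirm that it is really there. A minimal sketch using only base R, where the variable names codes and has_nbsp are just illustrative:

```r
# Rebuild the example string: code point 160 followed by "1199.00"
x <- intToUtf8(c(160, 49, 49, 57, 57, 46, 48, 48))

# Inspect the underlying code points; the first one is 160, not 32
codes <- utf8ToInt(x)
codes
#[1] 160  49  49  57  57  46  48  48

# TRUE if the string contains a non-breaking space anywhere
has_nbsp <- 160L %in% codes
has_nbsp
#[1] TRUE
```

An ordinary space would show up as 32 in the same position.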

One way to get rid of it is to use the str_trim() function from the stringr package:

library(stringr)
y <- str_trim(x)
y
#[1] "1199.00"

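Another way is to apply iconv() first, so that the non-breaking space is transliterated into something trimws() can remove (the result of //TRANSLIT can vary by platform):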

y <- iconv(x, from = 'UTF-8', to = 'ASCII//TRANSLIT')
trimws(y)
#[1] "1199.00"

UPDATE: Why trimws() does not remove the "invisible" character described above while stringr::str_trim() does.

Here is what the trimws() help says:

For portability, ‘whitespace’ is taken as the character class [ \t\r\n] (space, horizontal tab, line feed, carriage return)

The help topic for stringr::str_trim() does not itself specify what counts as "white space", but the help for stri_trim_both(), which str_trim() calls, shows: stri_trim_both(str, pattern = "\\P{Wspace}"). Here the pattern describes the characters to keep, so trimming stops at the first character that is not Unicode white space. That class is wider than [ \t\r\n] and includes the non-breaking space.

UPDATE 2

As @H1 noted, R version 3.6.0 added a whitespace argument to trimws() that lets you specify what to consider a whitespace character. From the help:

Internally, 'sub(re, "", *, perl = TRUE)', i.e., PCRE library regular expressions are used. For portability, the default 'whitespace' is the character class '[ \t\r\n]' (space, horizontal tab, carriage return, newline). Alternatively, '[\h\v]' is a good (PCRE) generalization to match all Unicode horizontal and vertical white space characters, see also <URL: https://www.pcre.org>.

So if you are using R 3.6.0 or later, you can simply do:

trimws(x, whitespace = "[\\h\\v]")
#[1] "1199.00"
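On R versions older than 3.6.0, where the whitespace argument is not available, roughly the same effect can be sketched by calling sub() with perl = TRUE directly, as the help text quoted above describes. The function name trim_ws_compat is a hypothetical helper introduced here for illustration, not part of base R:

```r
# Rebuild the example string: non-breaking space followed by "1199.00"
x <- intToUtf8(c(160, 49, 49, 57, 57, 46, 48, 48))

# Hypothetical helper: strip leading and trailing characters matching
# the given PCRE class, using two anchored sub() calls
trim_ws_compat <- function(s, whitespace = "[\\h\\v]") {
  s <- sub(paste0("^", whitespace, "+"), "", s, perl = TRUE)
  sub(paste0(whitespace, "+$"), "", s, perl = TRUE)
}

trimmed <- trim_ws_compat(x)
trimmed
#[1] "1199.00"
```

Only the ends of the string are touched; white space in the middle of a value is left alone.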

answered Sep 21 '22 03:09

Katia

From R version 3.6.0, trimws() has a whitespace argument that lets you define what is considered whitespace, which in this case needs to include the non-breaking space:

trimws(x, whitespace = "\u00A0|\\s")
#[1] "1199.00"

answered Sep 22 '22 03:09

Ritchie Sacramento

Tags: r

Omar Gonzales