
Checking if a string contains only blank space in R

I'm looking to see if a string contains only blank space. The string could be

"  "

or

"           "

or

"              " 

etc...

I want to do this so I can change such values in a data frame to NA; my goal is to clean up messy data.

Thank you

Kasarrah asked Mar 01 '16 14:03


2 Answers

You can try with grepl:

grepl("^\\s*$", your_string)

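"^\\s*$" asks for 0 or more (*) whitespace characters (\\s, which covers spaces, tabs and newlines) between the beginning (^) and end ($) of the string, so it also matches the empty string.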

Examples

grepl("^\\s*$", " ")
#[1] TRUE
grepl("^\\s*$", "")
#[1] TRUE
grepl("^\\s*$", "    ")
#[1] TRUE
grepl("^\\s*$", " ab")
#[1] FALSE

NB: if only literal spaces matter, you can also just use a space instead of \\s in the regex ("^ *$"); that version will not match tabs or newlines.
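Since the goal in the question is to turn such values into NA in a data frame, here is a minimal sketch of how the same regex could be applied column by column. The data frame df and its column names are made up for illustration; adapt them to your own data.

```r
# Hypothetical example data (column names are assumptions, not from the question)
df <- data.frame(
  id   = 1:4,
  name = c("Ann", "   ", "", "Bob"),
  city = c(" ", "Oslo", "\t", "Rome"),
  stringsAsFactors = FALSE
)

# Only character columns can hold blank strings, so restrict to those
is_chr <- vapply(df, is.character, logical(1))

# In each character column, set empty or whitespace-only cells to NA
df[is_chr] <- lapply(df[is_chr], function(col) {
  col[grepl("^\\s*$", col)] <- NA
  col
})

df
```

After this, the second and third entries of name and the first and third entries of city are NA, while the numeric id column is untouched. If your columns are factors rather than character vectors, you would probably want to convert them with as.character() first.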

Cath answered Oct 14 '22 23:10


Without regex, you could use

which(nchar(trimws(vec))==0)

The function trimws() (available since R 3.2.0) removes leading and trailing whitespace characters from a string. Hence, if the length of the string (determined by nchar()) is zero after trimws(), the string was empty or contained only whitespace; if it is not zero, the string contains at least one non-whitespace character.

Example:

vec <- c(" ", "", "   "," a  ", "             ", "b")
which(nchar(trimws(vec))==0)
#[1] 1 2 3 5

The entries 1, 2, 3, and 5 of the vector vec are either empty or contain only whitespace characters.


As suggested by Richard Scriven, the same result can be obtained without calling nchar(), by simply using trimws(vec)=="" or which(trimws(vec)==""), depending on the desired output: the former returns a logical vector, the latter the index numbers of the blank/empty entries.
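To make the difference between the two forms concrete, here is a small sketch using a vector similar to the one above, with one entry swapped for a tab plus newline to show that trimws() strips those as well. The variable name blank is just for illustration.

```r
vec <- c(" ", "", "   ", " a  ", "\t\n", "b")

# Logical vector: one TRUE/FALSE per element
blank <- trimws(vec) == ""
blank
#[1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE

# Index numbers of the blank/empty entries
which(blank)
#[1] 1 2 3 5

# The logical form can be used directly to replace those entries with NA
vec[blank] <- NA
vec
#[1] NA     NA     NA     " a  " NA     "b"
```

The logical form is usually the handier one for replacement, since it can be used directly as a subsetting index; which() is more useful when you want to report or inspect the positions.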

RHertel answered Oct 14 '22 21:10