Remove the rows that have non-numeric characters in one column in R


In the data frame, column A is expected to be a numeric vector.

So if an entry in that column contains any non-numeric characters, I would like to remove the entire corresponding row.

Does anyone have a solution? Thanks!

user3915459 asked Aug 13 '14 00:08



2 Answers

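When you import data into a data.frame, a column that is not entirely numeric is generally converted to a factor (this was the default before R 4.0.0; newer versions read it as character instead). With that in mind, you usually have to convert to character first and then to numeric. The as.character() step is harmless if the column is already character.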

dat <- data.frame(A=c(letters[1:5],1:5))

str(dat)
'data.frame':   10 obs. of  1 variable:
 $ A: Factor w/ 10 levels "1","2","3","4",..: 6 7 8 9 10 1 2 3 4 5

as.numeric(as.character(dat$A))
 [1] NA NA NA NA NA  1  2  3  4  5
Warning message:
NAs introduced by coercion  

Notice that the non-numeric entries are converted to NA. Using that to filter the rows:

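dat <- dat[!is.na(as.numeric(as.character(dat$A))), , drop = FALSE]

In words: keep the rows of dat whose value in A is not NA after conversion from factor to character to numeric. The drop = FALSE argument keeps the result a data frame even though dat has only one column.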
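The filtering step above can be wrapped in a small function. This is only a sketch: keep_numeric_rows is a hypothetical helper name, not part of the answer or of base R, and it assumes you also want the cleaned column stored as numeric. Note that rows where the column was already NA are dropped as well.

```r
# Hypothetical helper (not from the answer): keep the rows whose `col`
# value can be parsed as a number, and store that column as numeric.
keep_numeric_rows <- function(df, col) {
  # as.character() first, so a factor is converted by label, not by level code
  values <- suppressWarnings(as.numeric(as.character(df[[col]])))
  # drop = FALSE keeps a data frame even when df has a single column
  out <- df[!is.na(values), , drop = FALSE]
  out[[col]] <- values[!is.na(values)]
  out
}

dat <- data.frame(A = c(letters[1:5], 1:5), stringsAsFactors = TRUE)
clean <- keep_numeric_rows(dat, "A")
str(clean)
```

suppressWarnings() only hides the expected "NAs introduced by coercion" message; leave it out if you would rather see the warning.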

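Second issue: without drop = FALSE, subsetting a one-column data frame returns a plain vector rather than a data frame, so running the subsetting line a second time fails: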

> dat <- data.frame(A=c(letters[1:5],1:5))
> dat <- dat[!is.na(as.numeric(as.character(dat$A))),]
Warning message:
In `[.data.frame`(dat, !is.na(as.numeric(as.character(dat$A))),  :
  NAs introduced by coercion
> dat <- dat[!is.na(as.numeric(as.character(dat$A))),]
Error in dat$A : $ operator is invalid for atomic vectors
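A minimal sketch of the difference, using the same example data. The variable names keep, as_vector and as_frame are made up for illustration.

```r
dat <- data.frame(A = c(letters[1:5], 1:5))

# Logical mask: TRUE where the entry parses as a number
keep <- !is.na(suppressWarnings(as.numeric(as.character(dat$A))))

as_vector <- dat[keep, ]                # single column: simplified to a vector
as_frame  <- dat[keep, , drop = FALSE]  # stays a one-column data frame

is.data.frame(as_vector)
is.data.frame(as_frame)
```

With the second form, dat$A is still valid afterwards, so the same line can be run again without the "$ operator is invalid for atomic vectors" error.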
Brandon Bertelsen answered Sep 18 '22 14:09


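Alternatively, using @Brandon Bertelsen's example data, keep only the rows whose value consists entirely of digits and then convert the column to numeric:

dat1 <- transform(dat[grep("^\\d+$", dat$A), , drop = FALSE], A = as.numeric(as.character(A)))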
dat1
#   A
#6  1
#7  2
#8  3
#9  4
#10 5

 str(dat1)
#'data.frame':  5 obs. of  1 variable:
#$ A: num  1 2 3 4 5
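The pattern "^\\d+$" accepts only unsigned whole numbers, so values such as -3.5 would be removed. If signed or decimal values should count as numeric, a wider pattern is one option. The sketch below is an assumption about what "numeric" should mean here, not part of the answer; the test vector x and the pattern are illustrative.

```r
# Illustrative input: a mix of numeric-looking and non-numeric strings
x <- c("12", "-3.5", ".5", "1e3", "abc", "7 ")

# Optional sign, then digits with an optional fractional part,
# or a leading decimal point followed by digits
pattern <- "^[+-]?([0-9]+\\.?[0-9]*|\\.[0-9]+)$"

keep <- grepl(pattern, x)
x[keep]
as.numeric(x[keep])
```

This pattern is stricter than as.numeric(): it rejects scientific notation such as "1e3", which as.numeric() would convert. Extend the pattern if such values should be kept.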
akrun answered Sep 18 '22 14:09