
Removing a particular character from a column in R

Tags:

r

strsplit

I have a table called LOAN with a column named RATE whose values are given as percentages, for example 14.49%. How can I remove the % from every entry in RATE so that I can use the plot function on it? I tried using strsplit:

strsplit(LOAN$RATE,"%")

but got the error "non character argument".

HARJOT SINGH PARMAR asked Feb 05 '13 22:02



2 Answers

This can be achieved with the mutate verb from dplyr, which is part of the tidyverse and, in my opinion, more readable. To illustrate, I create a dataset called LOAN with a RATE column that mimics the problem above.

library(tidyverse)
LOAN <- data.frame("SN" = 1:4, "Age" = c(21,47,68,33), 
                   "Name" = c("John", "Dora", "Ali", "Marvin"),
                   "RATE" = c('16%', "24.5%", "27.81%", "22.11%"), 
                   stringsAsFactors = FALSE)
head(LOAN)
  SN Age   Name   RATE
1  1  21   John    16%
2  2  47   Dora  24.5%
3  3  68    Ali 27.81%
4  4  33 Marvin 22.11%

In what follows, mutate alters the column's content, gsub performs the substitution (replacing % with ""), and as.numeric() converts the RATE column to numeric, which keeps the data-cleaning flow easy to follow.

LOAN <- LOAN %>% mutate(RATE = as.numeric(gsub("%", "", RATE)))
head(LOAN)
  SN Age   Name  RATE
1  1  21   John 16.00
2  2  47   Dora 24.50
3  3  68    Ali 27.81
4  4  33 Marvin 22.11
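
For readers who would rather not load the tidyverse, the same cleaning step can be sketched in base R. This is only an illustrative variant, not part of the answer above: it rebuilds a cut-down copy of the toy LOAN data, and it uses sub with fixed = TRUE on the assumption that each value contains at most one literal % sign.

```r
# Cut-down copy of the toy data above, rebuilt so the block runs on its own.
LOAN <- data.frame(SN = 1:4,
                   RATE = c("16%", "24.5%", "27.81%", "22.11%"),
                   stringsAsFactors = FALSE)

# fixed = TRUE treats "%" as literal text rather than a regular expression.
# sub() replaces only the first match, which is enough for one trailing "%".
LOAN$RATE <- as.numeric(sub("%", "", LOAN$RATE, fixed = TRUE))

str(LOAN$RATE)
```

Once RATE is numeric it can be passed to plot directly, for example plot(LOAN$SN, LOAN$RATE).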
odunayo12 answered Oct 04 '22 12:10


Items that look like character strings when printed but that R treats otherwise are generally objects of class factor. I'm also guessing that you will not be happy with the list output that strsplit returns. Try:

gsub("%", "", as.character(LOAN$RATE))
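
A minimal sketch of the whole round trip, assuming RATE really is a factor as guessed above. The LOAN data frame here is a made-up stand-in for the asker's table, and the final as.numeric call is an addition so that the column can be plotted.

```r
# Hypothetical stand-in for the asker's data. stringsAsFactors = TRUE forces
# RATE to be a factor, which was the data.frame() default before R 4.0.0.
LOAN <- data.frame(RATE = c("14.49%", "16%", "24.5%"),
                   stringsAsFactors = TRUE)

# strsplit() rejects a factor; capture the error message instead of stopping.
err <- tryCatch(strsplit(LOAN$RATE, "%"),
                error = function(e) conditionMessage(e))
print(err)

# Convert to character first, strip the "%", then convert to numeric.
LOAN$RATE <- as.numeric(gsub("%", "", as.character(LOAN$RATE)))
str(LOAN$RATE)
```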

Factors that appear numeric can be a source of confusion as well:

> factor("14.9%")
[1] 14.9%
Levels: 14.9%
> as.character(factor("14.9%"))
[1] "14.9%"
> gsub("%", "", as.character(factor("14.9%")) )
[1] "14.9"

This is especially confusing since print.data.frame removes the quotes:

> data.frame(z=factor("14.9%"), zz=factor(14.9))
      z   zz
1 14.9% 14.9
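
Since the printed data frame hides the difference, one way to check is to ask R for each column's class. The sketch below reuses the z and zz columns from the example above; the variable names df, classes, codes and values are illustrative only. It also shows a related trap: calling as.numeric directly on a factor returns its internal level codes rather than the printed labels.

```r
df <- data.frame(z = factor("14.9%"), zz = factor(14.9))

# class() reveals what print.data.frame does not show.
classes <- sapply(df, class)
print(classes)

# as.numeric() on a factor gives the integer level codes ...
codes <- as.numeric(df$zz)
# ... so go through as.character() to recover the printed value.
values <- as.numeric(as.character(df$zz))

print(codes)
print(values)
```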
IRTFM answered Oct 04 '22 11:10