How to read a CSV file in R where some values contain the percent symbol (%)

Tags: r, csv

Is there a clean/automatic way to convert CSV values formatted as percentages (with a trailing % symbol) to numeric values in R?

Here is some example data:

actual,simulated,percent error
2.1496,8.6066,-300%
0.9170,8.0266,-775%
7.9406,0.2152,97%
4.9637,3.5237,29%

Which can be read using:

junk = read.csv("Example.csv")

But all of the % columns are read as strings and converted to factors:

> str(junk)
 'data.frame':  4 obs. of  3 variables:
 $ actual       : num  2.15 0.917 7.941 4.964
 $ simulated    : num  8.607 8.027 0.215 3.524
 $ percent.error: Factor w/ 4 levels "-300%","-775%",..: 1 2 4 3

but I would like them to be numeric values.

Is there an additional parameter for read.csv? Is there an easy way to post-process the affected columns to convert them to numeric values? Are there other solutions?

Note: of course in this example I could simply recompute the values, but in my real application with a larger data file this is not practical.

asked Jan 02 '14 22:01

Bryan P

1 Answer

There is no "percentage" type in R. So you need to do some post-processing:

DF <- read.table(text="actual,simulated,percent error
2.1496,8.6066,-300%
0.9170,8.0266,-775%
7.9406,0.2152,97%
4.9637,3.5237,29%", sep=",", header=TRUE)

DF[,3] <- as.numeric(gsub("%", "",DF[,3]))/100

#  actual simulated percent.error
#1 2.1496    8.6066         -3.00
#2 0.9170    8.0266         -7.75
#3 7.9406    0.2152          0.97
#4 4.9637    3.5237          0.29
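
For a larger file with several percent columns, the same post-processing can be applied to every column that looks like a percentage. The following is only a sketch under stated assumptions: `percent_to_numeric` is a hypothetical helper name (not part of base R or of the answer above), and it assumes a column counts as a percent column only when every non-missing value is a plain number immediately followed by `%`.

```r
# Hypothetical helper (an illustration, not part of base R):
# convert every column whose non-missing values all look like "12%" or
# "-3.5%" into a numeric fraction (e.g. "97%" becomes 0.97).
percent_to_numeric <- function(df) {
  # TRUE if all non-missing values of x are numbers with a trailing "%"
  is_percent <- function(x) {
    x <- trimws(as.character(x))  # also handles factor columns
    x <- x[!is.na(x)]
    length(x) > 0 && all(grepl("^[-+]?[0-9]*\\.?[0-9]+%$", x))
  }

  cols <- vapply(df, is_percent, logical(1))

  # Strip the "%" and divide by 100, as in the one-column version above
  df[cols] <- lapply(df[cols], function(x) {
    as.numeric(sub("%", "", trimws(as.character(x)), fixed = TRUE)) / 100
  })
  df
}

# Same example data as in the question
DF <- read.csv(text = "actual,simulated,percent error
2.1496,8.6066,-300%
0.9170,8.0266,-775%
7.9406,0.2152,97%
4.9637,3.5237,29%")

DF <- percent_to_numeric(DF)
str(DF)
```

Columns that mix percent strings with other text are left untouched, so they would still need to be handled by hand.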

answered Sep 18 '22 10:09

Roland
