In R: remove commas from a field AND have the modified field remain part of the dataframe

Tags: string, r, comma

I need to remove commas from a field in an R dataframe. Technically I have managed to do this, but the result seems to be neither a vector nor a matrix, and I cannot get it back into the dataframe in a usable format. So is there a way to remove the commas from a field AND have that field remain part of the dataframe?

Here is a sample of the field that needs commas removed, and the results generated by my code:

> print(x['TOT_EMP'])
         TOT_EMP
1    132,588,810
2      6,542,950
3      2,278,260
4        248,760

> y
[1] "c(\"132588810\" \"6542950\" \"2278260\" \"248760\...)"

The desired result is a numeric field:

       TOT_EMP
1    132588810
2      6542950
3      2278260
4       248760

x<-read.csv("/home/mark/Desktop/national_M2013_dl.csv",header=TRUE,colClasses="character")
y=(gsub(",","",x['TOT_EMP']))
print(y)


asked Jan 24 '15 19:01

mark stevenson

1 Answer

gsub() returns a character vector, not a numeric vector (which is what it sounds like you want). as.numeric() will convert the character vector into a numeric vector:

> df <- data.frame(numbers = c("123,456,789", "1,234,567", "1,234", "1"))
> df
      numbers
1 123,456,789
2   1,234,567
3       1,234
4           1
> df$numbers <- as.numeric(gsub(",","",df$numbers))
> df
    numbers
1 123456789
2   1234567
3      1234
4         1

The result is still a data.frame:

> class(df)
[1] "data.frame"
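
For the data in the question, the likely reason the original attempt produced a single odd-looking string is that x['TOT_EMP'] (single brackets) returns a one-column data.frame rather than a vector, and gsub() coerces that whole object to one deparsed string. Extracting the column with $ (or [[ ]]) avoids this. Here is a minimal sketch, assuming a small stand-in data frame built from the four values shown in the question instead of the actual CSV file:

```r
# Hypothetical stand-in for the asker's CSV: only the four TOT_EMP values
# shown in the question, stored as character (as with colClasses = "character").
x <- data.frame(
  TOT_EMP = c("132,588,810", "6,542,950", "2,278,260", "248,760"),
  stringsAsFactors = FALSE
)

# x["TOT_EMP"] is a one-column data.frame, not a character vector, so gsub()
# coerces the whole object to a single string before substituting.
bad <- gsub(",", "", x["TOT_EMP"])
length(bad)  # 1

# x$TOT_EMP extracts the column as a character vector; strip the commas,
# convert to numeric, and assign back into the same column.
x$TOT_EMP <- as.numeric(gsub(",", "", x$TOT_EMP))

str(x)
```

Assigning the result back to x$TOT_EMP is what keeps the cleaned field inside the data frame. x[["TOT_EMP"]] would work the same way as x$TOT_EMP here.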

answered Sep 22 '22 12:09

Richard Border
