Add column which contains binned values of an integer column

Tags: r, r-faq

I have a data frame with a few columns. One of them, rank, is an integer between 1 and 20. I want to create another column that contains a bin value such as "1-4", "5-10", "11-15", or "16-20".

What is the most effective way to do this?

The data frame I have looks like this (in CSV format):

rank,name,info
1,steve,red
3,joe,blue
6,john,green
3,liz,yellow
15,jon,pink

I want to add another column to the data frame, so that it looks like this:

rank,name,info,binValue
1,steve,red,"1-4"
3,joe,blue,"1-4"
6,john,green,"5-10"
3,liz,yellow,"1-4"
15,jon,pink,"11-15"

My current approach is not working: I would like to keep the data frame intact and just add another column whose value depends on which range df$rank falls in. Thank you.

asked Apr 06 '11 17:04

wespiserA

2 Answers

See ?cut and specify breaks (and maybe labels).

# cut() uses right-closed intervals by default: (0,4], (4,10], (10,15], (15,20]
x$bins <- cut(x$rank, breaks=c(0,4,10,15,20), labels=c("1-4","5-10","11-15","16-20"))
x
#   rank  name   info  bins
# 1    1 steve    red   1-4
# 2    3   joe   blue   1-4
# 3    6  john  green  5-10
# 4    3   liz yellow   1-4
# 5   15   jon   pink 11-15
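It may help to see where values on the bin edges end up. The following is only a sketch: the `ranks` vector is made-up test data (not from the question), and the breaks assume the four bins the question asks for. With the default `right = TRUE`, each interval includes its upper break and excludes its lower one, and anything outside all intervals becomes `NA`.

```r
# Hypothetical ranks chosen to sit on and around the bin edges
ranks <- c(1, 4, 5, 10, 11, 15, 16, 20, 25)

# Assumed breaks and labels for the four bins named in the question
breaks <- c(0, 4, 10, 15, 20)
labels <- c("1-4", "5-10", "11-15", "16-20")

# Default right = TRUE gives the intervals (0,4], (4,10], (10,15], (15,20]
bins <- cut(ranks, breaks = breaks, labels = labels)

# Show each rank next to its bin; 25 lies outside every interval, so it is NA
print(data.frame(rank = ranks, bin = bins))
```

So a rank of 4 lands in "1-4" and a rank of 5 in "5-10", which is what the labels suggest for integer ranks.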

answered Sep 21 '22 17:09

Joshua Ulrich

dat <- "rank,name,info
1,steve,red
3,joe,blue
6,john,green
3,liz,yellow
15,jon,pink"

x <- read.table(textConnection(dat), header=TRUE, sep=",", stringsAsFactors=FALSE)
x$bins <- cut(x$rank, breaks=seq(0, 20, 5), labels=c("1-5", "6-10", "11-15", "16-20"))
x
#   rank  name   info  bins
# 1    1 steve    red   1-5
# 2    3   joe   blue   1-5
# 3    6  john  green  6-10
# 4    3   liz yellow   1-5
# 5   15   jon   pink 11-15
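Typing the labels by hand makes it easy for them to drift out of step with the breaks. One possible way around that, shown here as a sketch rather than as part of either answer, is to derive the labels from the breaks. It assumes the ranks are whole numbers, so the lower end of each bin can be written as the previous break plus one. The data frame is rebuilt inline and the column name `binValue` is taken from the question.

```r
# Rebuild the question's example data (values copied from the CSV above)
x <- data.frame(
  rank = c(1, 3, 6, 3, 15),
  name = c("steve", "joe", "john", "liz", "jon"),
  info = c("red", "blue", "green", "yellow", "pink"),
  stringsAsFactors = FALSE
)

# Assumed breaks for the bins requested in the question
breaks <- c(0, 4, 10, 15, 20)

# Derive "lower-upper" labels: lower = previous break + 1, upper = next break
# (only meaningful for integer-valued ranks)
labels <- paste(head(breaks, -1) + 1, tail(breaks, -1), sep = "-")

# Add the new column; the existing columns are left untouched
x$binValue <- cut(x$rank, breaks = breaks, labels = labels)
print(x)
```

Changing `breaks` then updates the labels automatically, so the two cannot disagree.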

answered Sep 25 '22 17:09

Andrie
