filtering data.frame based on row_number()

Tags:

r

dplyr

UPDATE: dplyr has been updated since this question was asked and now performs as the OP wanted

I'm trying to get the second through seventh rows of a data.frame using dplyr.

I'm doing this:

require(dplyr)
df <- data.frame(id = 1:10, var = runif(10))
df <- df %>% filter(row_number() <= 7, row_number() >= 2)

But this throws an error.

Error in rank(x, ties.method = "first") :
  argument "x" is missing, with no default

I know I could easily do:

df <- df %>% mutate(rn = row_number()) %>% filter(rn <= 7, rn >= 2)

But I would like to understand why my first try is not working.

asked Sep 23 '14 11:09

Daniel Falbel

2 Answers

Actually dplyr's slice function is made for this kind of subsetting:

df %>% slice(2:7)

(I'm a little late to the party but thought I'd add this for future readers)
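For future readers, here is a minimal sketch comparing slice(2:7) with the filter from the question. It assumes a recent dplyr release, where (as the update to the question notes) row_number() with no argument works inside filter(). The seed and the names by_slice and by_filter are only illustrative.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the example is reproducible
df <- data.frame(id = 1:10, var = runif(10))

# Positional subsetting: keep rows 2 through 7
by_slice <- df %>% slice(2:7)

# The filter from the question; assumed to work in recent dplyr versions
by_filter <- df %>% filter(row_number() >= 2, row_number() <= 7)

# Both approaches should select the same ids
by_slice$id
by_filter$id
```

slice() states the intent (select by position) directly, so it is probably the clearer choice when no other condition is involved.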

answered Oct 22 '22 06:10

talat

The row_number() function does not simply return the row number of each element, so it can't be used the way you want:

• ‘row_number’: equivalent to ‘rank(ties.method = "first")’

You're not actually saying what you want the row_number of. In your case:

df %>% filter(row_number(id) <= 7, row_number(id) >= 2)

works because id is sorted and so row_number(id) is 1:10. I don't know what row_number() evaluates to in this context, but when called a second time dplyr has run out of things to feed it and you get the equivalent of:

> row_number()
Error in rank(x, ties.method = "first") :
  argument "x" is missing, with no default

That's your error right there.
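To make the ranking behaviour concrete, here is a small sketch with a made-up vector x (not from the question). It shows that row_number(x) ranks the values rather than reporting positions, and that the two only coincide when the input is already sorted.

```r
library(dplyr)

x <- c(30, 10, 20)  # hypothetical unsorted values

# With an argument, row_number() ranks the values,
# breaking ties by order of first appearance
ranks_dplyr <- row_number(x)
ranks_base  <- rank(x, ties.method = "first")

ranks_dplyr  # 3 1 2
ranks_base   # 3 1 2

# For an already sorted vector, the ranks equal the positions,
# which is why row_number(id) happens to work on this data
sorted_ranks <- row_number(1:10)
sorted_ranks
```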

Anyway, that's not the way to select rows.

You simply need to subscript df[2:7,], or if you insist on pipes everywhere:

> df %>% "["(.,2:7,)
  id        var
2  2 0.52352994
3  3 0.02994982
4  4 0.90074801
5  5 0.68935493
6  6 0.57012344
7  7 0.01489950
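As a quick check, the following sketch compares plain subscripting with a piped equivalent. The var column is a deterministic stand-in for runif(10), and base_rows and piped_rows are illustrative names; the `.[2:7, ]` form relies on the magrittr dot placeholder and is just another spelling of the "[" call above.

```r
library(dplyr)  # only needed here for the %>% pipe

# Deterministic stand-in for the random data in the question
df <- data.frame(id = 1:10, var = seq(0.1, 1, by = 0.1))

# Plain base R subscripting
base_rows <- df[2:7, ]

# The same subscripting written with the pipe and the dot placeholder
piped_rows <- df %>% .[2:7, ]

identical(base_rows, piped_rows)

# Base subscripting keeps the original row names ("2" to "7"),
# matching the printed output above
rownames(base_rows)
```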

answered Oct 22 '22 07:10

Spacedman
