Count how many times a value appears and add the result to a column

Tags:

dataframe

r

data.table

I have this data frame:

   Generacion 1 2 3 4 5 6 NP1 NP2 NP3 NP4 NP5 NP6
1:          1 0 0 0 0 0 0   4   4   4   4   5   5
2:          2 0 0 0 0 0 0   4   4   4   4   4   4
3:          3 0 0 0 0 0 0   5   5   5   5   5   5
4:          4 0 0 0 0 0 0   4   5   5   5   4   4
5:          5 0 0 0 0 0 0   5   4   4   4   4   4
6:          6 0 0 0 0 0 0   5   5   5   5   4   4

I want to modify columns 1 through 6 so that each one counts the occurrences of its own value in the columns to the right (NP1 to NP6). For example, column 4 should count the number of times 4 occurs in each row. I want to repeat this for every number; the values can range between 0 and 5. The final result should look like this:

head(t2 %>% select(1, 2, 3, 4, 5, 6, 7, NP1, NP2, NP3, NP4, NP5, NP6))
   Generacion 1 2 3 4 5 6 NP1 NP2 NP3 NP4 NP5 NP6
1:          1 0 0 0 4 2 0   4   4   4   4   5   5
2:          2 0 0 0 6 0 0   4   4   4   4   4   4
3:          3 0 0 0 0 6 0   5   5   5   5   5   5
4:          4 0 0 0 3 3 0   4   5   5   5   4   4
5:          5 0 0 0 5 1 0   5   4   4   4   4   4
6:          6 0 0 0 2 4 0   5   5   5   5   4   4

I tried using the data.table package and did the following:

 t2[NP1 == 4]$`4` <- t2[NP1 == 4]$`4` + 1

But I got the following error:

Error in [<-.data.table(*tmp*, NP1 == 4, value = c(1, 1, 1, 1)) : Can't assign to the same column twice in the same query (duplicates detected).

So I have 2 questions:

Why do I get this error?
Is there an easier, more intuitive way to do it?

asked Sep 18 '21 17:09

Qiyao

4 Answers

With data.table:

library(data.table)

setDT(t2)

t2[,as.character(1:6):=lapply(1:6, function(n) rowSums(.SD==n)),.SDcols=NP1:NP6][]

#   Generacion 1 2 3 4 5 6 NP1 NP2 NP3 NP4 NP5 NP6
#1:          1 0 0 0 4 2 0   4   4   4   4   5   5
#2:          2 0 0 0 6 0 0   4   4   4   4   4   4
#3:          3 0 0 0 0 6 0   5   5   5   5   5   5
#4:          4 0 0 0 3 3 0   4   5   5   5   4   4
#5:          5 0 0 0 5 1 0   5   4   4   4   4   4
#6:          6 0 0 0 2 4 0   5   5   5   5   4   4

Data:

t2 <- read.table(text=
"Generacion 1 2 3 4 5 6 NP1 NP2 NP3 NP4 NP5 NP6
          1 0 0 0 0 0 0   4   4   4   4   5   5
          2 0 0 0 0 0 0   4   4   4   4   4   4
          3 0 0 0 0 0 0   5   5   5   5   5   5
          4 0 0 0 0 0 0   4   5   5   5   4   4
          5 0 0 0 0 0 0   5   4   4   4   4   4
          6 0 0 0 0 0 0   5   5   5   5   4   4",header=T)

colnames(t2) <- c('Generacion','1','2','3','4','5','6','NP1','NP2','NP3','NP4','NP5','NP6')
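For comparison, the incremental idea from the question (add 1 to a count column wherever an NP column holds that value) can also be written with `:=` inside `[`, which is data.table's documented way to update a subset of rows by reference. This is only a sketch under assumptions: the data is rebuilt by hand from the question's table, the loop variable names `np_col` and `cnt_col` are made up for illustration, and it does not attempt to explain the exact cause of the error in the question.

```r
library(data.table)

# NP columns copied by hand from the question's table (assumption: numeric values)
t2 <- data.table(
  Generacion = 1:6,
  NP1 = c(4, 4, 5, 4, 5, 5),
  NP2 = c(4, 4, 5, 5, 4, 5),
  NP3 = c(4, 4, 5, 5, 4, 5),
  NP4 = c(4, 4, 5, 5, 4, 5),
  NP5 = c(5, 4, 5, 4, 4, 4),
  NP6 = c(5, 4, 5, 4, 4, 4)
)

# Create the count columns "1" to "6", all starting at 0
count_cols <- as.character(1:6)
t2[, (count_cols) := 0]

# For every NP column and every value, add 1 to the matching count column
# in the rows where that NP column equals the value
for (np_col in paste0("NP", 1:6)) {
  for (cnt_col in count_cols) {
    t2[get(np_col) == as.numeric(cnt_col), (cnt_col) := get(cnt_col) + 1]
  }
}

t2[]
```

On this data, columns `4` and `5` should end up equal to the desired result shown in the question. The vectorised `rowSums(.SD == n)` version above does the same work in one pass and is likely the better choice for larger tables.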

answered Oct 16 '22 16:10

Waldi

One option using dplyr could be (data imported with corrected column names):

df %>%
    mutate(across(X1:X6, ~ rowSums(across(NP1:NP6) == as.numeric(sub("\\D+", "", cur_column())))))

   Generacion X1 X2 X3 X4 X5 X6 NP1 NP2 NP3 NP4 NP5 NP6
1:          1  0  0  0  4  2  0   4   4   4   4   5   5
2:          2  0  0  0  6  0  0   4   4   4   4   4   4
3:          3  0  0  0  0  6  0   5   5   5   5   5   5
4:          4  0  0  0  3  3  0   4   5   5   5   4   4
5:          5  0  0  0  5  1  0   5   4   4   4   4   4
6:          6  0  0  0  2  4  0   5   5   5   5   4   4

If you want to use column names containing only numbers:

df %>%
    mutate(across(`1`:`6`, ~ rowSums(across(NP1:NP6) == as.numeric(cur_column()))))

  Generacion 1 2 3 4 5 6 NP1 NP2 NP3 NP4 NP5 NP6
1          1 0 0 0 4 2 0   4   4   4   4   5   5
2          2 0 0 0 6 0 0   4   4   4   4   4   4
3          3 0 0 0 0 6 0   5   5   5   5   5   5
4          4 0 0 0 3 3 0   4   5   5   5   4   4
5          5 0 0 0 5 1 0   5   4   4   4   4   4
6          6 0 0 0 2 4 0   5   5   5   5   4   4

answered Oct 16 '22 17:10

tmfmnk

First, get the columns that must be compared to an integer and the corresponding columns that have those integers as names.

This part of the code is common to both solutions below.

cols_to_add <- grep("^NP", names(t2), value = TRUE)
cols_to_change <- match(gsub("[^[:digit:]]", "", cols_to_add), names(t2)[-1])
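To make it easier to see what these two helper vectors hold, here is a small check run on the question's column names alone. It is an illustrative sketch: the `nms` and `digits` variables are introduced here and are not part of the answer's code.

```r
# Column names as in the question (assumption: this exact order)
nms <- c("Generacion", as.character(1:6), paste0("NP", 1:6))

# Names of the columns whose values are counted
cols_to_add <- grep("^NP", nms, value = TRUE)

# Strip everything that is not a digit, leaving "1" to "6"
digits <- gsub("[^[:digit:]]", "", cols_to_add)

# Positions of those digit names among the columns after Generacion
cols_to_change <- match(digits, nms[-1])

print(cols_to_add)
print(cols_to_change)
```

Note that `match()` returns positions, not the names themselves. With this layout the count columns sit directly after `Generacion` and are named `1` to `6`, so each position coincides with the value to count, which is what lets `cols_to_change` be used both as the comparison value and, via `as.character()`, as the column name.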

Base R

The simplest option, in my opinion, is the base R function rowSums.

t2[as.character(cols_to_change)] <- lapply(cols_to_change, \(x) rowSums(t2[cols_to_add] == x))
t2
#  Generacion 1 2 3 4 5 6 NP1 NP2 NP3 NP4 NP5 NP6
#1          1 0 0 0 4 2 0   4   4   4   4   5   5
#2          2 0 0 0 6 0 0   4   4   4   4   4   4
#3          3 0 0 0 0 6 0   5   5   5   5   5   5
#4          4 0 0 0 3 3 0   4   5   5   5   4   4
#5          5 0 0 0 5 1 0   5   4   4   4   4   4
#6          6 0 0 0 2 4 0   5   5   5   5   4   4

Package `data.table`.

Here is a data.table solution, also with an lapply loop.

library(data.table)

setDT(t2)
t2[, as.character(cols_to_change) := lapply(
  cols_to_change, \(x) rowSums(.SD == x)), 
  .SDcols = cols_to_add]
t2
#   Generacion 1 2 3 4 5 6 NP1 NP2 NP3 NP4 NP5 NP6
#1:          1 0 0 0 4 2 0   4   4   4   4   5   5
#2:          2 0 0 0 6 0 0   4   4   4   4   4   4
#3:          3 0 0 0 0 6 0   5   5   5   5   5   5
#4:          4 0 0 0 3 3 0   4   5   5   5   4   4
#5:          5 0 0 0 5 1 0   5   4   4   4   4   4
#6:          6 0 0 0 2 4 0   5   5   5   5   4   4

answered Oct 16 '22 16:10

Rui Barradas

A tidyverse solution:

library(dplyr)
library(tidyr)

df %>% 
  pivot_longer(starts_with("NP")) %>% 
  count(Generacion, value)%>% 
  rbind(expand.grid(Generacion = 1:nrow(df), value = 1:6, n = 0)) %>%
  group_by(Generacion, value) %>% summarise(n = sum(n))%>%
  pivot_wider(id_cols = Generacion, names_from = value, values_from = n) %>%
  bind_cols(df %>% select(NP1:NP6))

# A tibble: 6 x 13
# Groups:   Generacion [6]
  Generacion   `1`   `2`   `3`   `4`   `5`   `6`   NP1   NP2   NP3   NP4   NP5   NP6
       <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int> <int> <int> <int> <int>
1          1     0     0     0     4     2     0     4     4     4     4     5     5
2          2     0     0     0     6     0     0     4     4     4     4     4     4
3          3     0     0     0     0     6     0     5     5     5     5     5     5
4          4     0     0     0     3     3     0     4     5     5     5     4     4
5          5     0     0     0     5     1     0     5     4     4     4     4     4
6          6     0     0     0     2     4     0     5     5     5     5     4     4

answered Oct 16 '22 16:10

Daniel
