Extract rows where value appears in any of multiple columns

Let's say I have two data frames:

name_df = read.table(text = "player_name
a
b
c
d
e
f
g", header = T)

game_df = read.table(text = "game_id winner_name loser_name
1 a b
2 b a
3 a c
4 a d
5 b c
6 c d
7 d e
8 e f
9 f a
10 g f
11 g a
12 f e
13 a d", header = T)

name_df contains a unique list of all the winner_name and loser_name values in game_df. I want to create a new data frame that has, for each person in name_df, one row for every game in which that name (e.g. a) appears in either the winner_name or the loser_name column.

So I essentially want to merge game_df with name_df, but the key column (player_name) can match either winner_name or loser_name.

So, for just a and b, the final output would look something like:

final_df = read.table(text = "player_name game_id winner_name loser_name
a 1 a b
a 2 b a
a 3 a c
a 4 a d
a 9 f a
a 11 g a
a 13 a d
b 1 a b
b 2 b a
b 5 b c", header = T)

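We can loop over the values of player_name in name_df and, for each one, filter the rows of game_df where that name appears as either winner_name or loser_name. Naming the input vector lets .id store the player in a new player_name column:

library(dplyr)
library(purrr)
# the names of the input vector become the values of the .id column
map_dfr(setNames(name_df$player_name, name_df$player_name),
   ~ game_df %>%
        filter(winner_name %in% .x | loser_name %in% .x), .id = 'player_name')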

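Or, if there are many columns to check, use if_any (available in dplyr 1.0.4 and later):

map_dfr(setNames(name_df$player_name, name_df$player_name),
  ~ {
    # store the current name; inside the inner lambda, `.` refers to the column
    nm1 <- .x
    game_df %>%
      filter(if_any(c(winner_name, loser_name), ~ . %in% nm1))
  }, .id = 'player_name')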
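
As a rough illustration of the "many columns" case, the sketch below selects the columns with ends_with("_name") instead of listing them. The trimmed five-game game_df, the players vector, and the games_by_player name are assumptions made for the example, and the selector only works if every column to search shares that suffix.

```r
library(dplyr)
library(purrr)

# Trimmed copy of the question's data (first five games only)
game_df <- read.table(text = "game_id winner_name loser_name
1 a b
2 b a
3 a c
4 a d
5 b c", header = TRUE)

# Hypothetical subset of players to look up
players <- c("a", "b", "c")

games_by_player <- map_dfr(
  setNames(players, players),
  # a named function avoids the clash between the outer and inner lambda arguments
  function(nm) filter(game_df, if_any(ends_with("_name"), ~ .x %in% nm)),
  .id = "player_name"
)

games_by_player
```

Using a named function argument (nm) is one way to avoid copying .x into a temporary variable before the inner lambda.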

Dedicated to our teacher and mentor dear @akrun

I think we can also make use of the add_row() function you first taught me the other day. Unbelievable!!!

library(dplyr)
library(purrr)
library(tibble)

game_df %>%
  # tag each game with its winner
  mutate(player_name = winner_name) %>%
  # split into one single-row data frame per game
  group_split(game_id) %>%
  # append a copy of each row tagged with the loser instead
  map_dfr(~ add_row(.x, game_id = .x$game_id, winner_name = .x$winner_name, 
                    loser_name = .x$loser_name, player_name = .x$loser_name)) %>%
  arrange(player_name) %>%
  relocate(player_name)


# A tibble: 26 x 4
   player_name game_id winner_name loser_name
   <chr>         <int> <chr>       <chr>     
 1 a                 1 a           b         
 2 a                 2 b           a         
 3 a                 3 a           c         
 4 a                 4 a           d         
 5 a                 9 f           a         
 6 a                11 g           a         
 7 a                13 a           d         
 8 b                 1 a           b         
 9 b                 2 b           a         
10 b                 5 b           c         
# ... with 16 more rows

This can be directly expressed in SQL:

library(sqldf)

sqldf("select * 
  from name_df 
  left join game_df on winner_name = player_name or loser_name = player_name")

Without using purrr: I think this is an appropriate use case for tidyr::unite with the argument remove = FALSE. We first unite the winners' and losers' names into one column, then use tidyr::separate_rows to split that new column into one row per name.

library(tidyr)
library(dplyr)

game_df %>%
  unite(Player_name, winner_name, loser_name, remove = FALSE, sep = ', ') %>%
  separate_rows(Player_name) %>%
  relocate(Player_name) %>%
  arrange(Player_name)

# A tibble: 26 x 4
   Player_name game_id winner_name loser_name
   <chr>         <int> <chr>       <chr>     
 1 a                 1 a           b         
 2 a                 2 b           a         
 3 a                 3 a           c         
 4 a                 4 a           d         
 5 a                 9 f           a         
 6 a                11 g           a         
 7 a                13 a           d         
 8 b                 1 a           b         
 9 b                 2 b           a         
10 b                 5 b           c         
# ... with 16 more rows
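
A related reshaping idea, sketched here as an alternative rather than as part of the answer above, is tidyr::pivot_longer. Copying the two name columns first keeps the originals, and the extra role column records whether the player won or lost. The trimmed game_df and the names winner, loser, role, and long_df are assumptions for the example.

```r
library(dplyr)
library(tidyr)

# Trimmed copy of the question's data (first five games only)
game_df <- read.table(text = "game_id winner_name loser_name
1 a b
2 b a
3 a c
4 a d
5 b c", header = TRUE)

long_df <- game_df %>%
  # duplicate the name columns so the originals survive the pivot
  mutate(winner = winner_name, loser = loser_name) %>%
  # one row per (game, player); `role` says which column the name came from
  pivot_longer(c(winner, loser), names_to = "role", values_to = "player_name") %>%
  relocate(player_name, role) %>%
  arrange(player_name, game_id)

long_df
```

Because nothing is pasted together and split again, this version does not depend on a separator that might also occur inside a player name.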

A base R approach:

result <- do.call(rbind, lapply(name_df$player_name, function(x) 
                   cbind(playername = x, 
                   subset(game_df, winner_name == x | loser_name == x))))

rownames(result) <- NULL

result
#   playername game_id winner_name loser_name
#1           a       1           a          b
#2           a       2           b          a
#3           a       3           a          c
#4           a       4           a          d
#5           a       9           f          a
#6           a      11           g          a
#7           a      13           a          d
#8           b       1           a          b
#...
#...
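
If looping over every player is too slow on a larger table, a possible loop-free variant in base R stacks two copies of game_df, one keyed by the winner and one keyed by the loser. This is only a sketch on a trimmed five-game copy of the data, and the name long_df is an assumption for the example.

```r
# Trimmed copy of the question's data (first five games only)
game_df <- read.table(text = "game_id winner_name loser_name
1 a b
2 b a
3 a c
4 a d
5 b c", header = TRUE)

# One copy of the games keyed by winner, one keyed by loser
long_df <- rbind(
  cbind(player_name = game_df$winner_name, game_df),
  cbind(player_name = game_df$loser_name, game_df)
)

# Sort by player, then by game, and reset the row names
long_df <- long_df[order(long_df$player_name, long_df$game_id), ]
rownames(long_df) <- NULL

long_df
```

Unlike the lapply version, this only returns players who appear in at least one game, which matches the question's data because name_df was built from game_df.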

Tags: r, data.table, dplyr

Parseltongue

5 Answers: akrun, Anoushiravan R, G. Grothendieck, AnilGoyal, Ronak Shah