Extract tuples with specified common values in another column in SQL

Tags:

mysql

r

I have a dataset that looks like:

 Col1    Col2    
 1        ABC 
 2        DEF
 3        ABC 
 1        DEF

Expected output:

Col1     Col2    
 1        ABC 
 1        DEF

I want to extract only those IDs from Col1 that have both values, ABC and DEF, in Col2.

I tried a self-join in SQL, but that did not give me the expected result.

SELECT DISTINCT Col1
FROM db A, db B
WHERE A.ID <> B.ID
    AND A.Col2 = 'ABC'
    AND B.Col2 = 'DEF' 
GROUP BY A.Col1

I also tried to do the same thing in R using the following code:

vc <- c("ABC", "DEF")
data1 <- db[db$Col2 %in% vc,]

Again, I did not get the desired output. Thanks for all the pointers in advance.


asked Oct 03 '18 06:10

marine8115

2 Answers

In R, you could do

library(dplyr) 
df %>% 
   group_by(Col1) %>% 
   filter(all(vc %in% Col2))

#   Col1 Col2 
#  <int> <fct>
#1     1 ABC  
#2     1 DEF

The Base R equivalent of that would be

df[as.logical(with(df, ave(as.character(Col2), Col1, FUN = function(x) all(vc %in% x)))), ]

#  Col1 Col2
#1    1  ABC
#4    1  DEF

We select the groups that have all of the values in vc.
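
To see what the group-wise check does step by step, here is a minimal, self-contained sketch in base R built on the sample data from the question. The names has_all, keep_ids and result are made up for illustration, and Col2 is assumed to be a character column rather than a factor.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps Col2 as character
df <- data.frame(Col1 = c(1, 2, 3, 1),
                 Col2 = c("ABC", "DEF", "ABC", "DEF"),
                 stringsAsFactors = FALSE)
vc <- c("ABC", "DEF")

# For each Col1 group, test whether every value in vc appears in that group's Col2
has_all <- tapply(df$Col2, df$Col1, function(x) all(vc %in% x))

# IDs (as character names) whose group passed the test
keep_ids <- names(has_all)[has_all]

# Keep only the rows that belong to those IDs
result <- df[as.character(df$Col1) %in% keep_ids, ]
print(result)
```

Filtering with db$Col2 %in% vc alone, as in the question, only checks each row's own value, so it keeps IDs 2 and 3 as well. The per-group all() is what restricts the result to IDs that carry both values.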

answered Oct 20 '22 23:10

Ronak Shah

In R, we can also use data.table

library(data.table)
setDT(df)[, .SD[all(vc %in% Col2)], by = Col1]
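
As a self-contained sketch of the same idea, the variant below builds the question's sample data directly and uses the common if (...) .SD idiom instead of subsetting .SD with a logical. The names dt and res are made up for illustration, and the data.table package is assumed to be installed.

```r
library(data.table)

# Sample data from the question, built directly as a data.table
dt <- data.table(Col1 = c(1, 2, 3, 1),
                 Col2 = c("ABC", "DEF", "ABC", "DEF"))
vc <- c("ABC", "DEF")

# Return a group's rows only when that group contains every value in vc;
# groups that fail the test return NULL and are dropped from the result
res <- dt[, if (all(vc %in% Col2)) .SD, by = Col1]
print(res)
```

Both forms should give the same rows here. The if form skips building an empty subset for the groups that fail the check.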


answered Oct 21 '22 00:10

akrun
