
Extract tuples with specified common values in another column in SQL

Tags: mysql, r

I have a dataset that looks like this:

 Col1    Col2    
 1        ABC 
 2        DEF
 3        ABC 
 1        DEF 

Expected output:

Col1     Col2    
 1        ABC 
 1        DEF

I want to extract only those IDs from Col1 that have both values, ABC and DEF, in Col2.

I tried a self-join in SQL, but it did not give me the expected result.

SELECT DISTINCT Col1
FROM db A, db B
WHERE A.ID <> B.ID
    AND A.Col2 = 'ABC'
    AND B.Col2 = 'DEF' 
GROUP BY A.Col1

I also tried to do the same thing in R using the following code:

vc <- c("ABC", "DEF")
data1 <- db[db$Col2 %in% vc,]

Again, I did not get the desired output. Thanks in advance for any pointers.

marine8115 asked Oct 03 '18 06:10



2 Answers

In R, you could do

library(dplyr) 
df %>% 
   group_by(Col1) %>% 
   filter(all(vc %in% Col2))

#   Col1 Col2 
#  <int> <fct>
#1     1 ABC  
#2     1 DEF  

The Base R equivalent of that would be

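# as.character() guards against Col2 being a factor, which ave() cannot fill with TRUE/FALSE
df[as.logical(with(df, ave(as.character(Col2), Col1, FUN = function(x) all(vc %in% x)))), ]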

#  Col1 Col2
#1    1  ABC
#4    1  DEF

We keep only the groups of Col1 that contain every value in vc.
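
Since the question also asks about SQL, the same group-wise filter can be sketched there. This is a minimal sketch rather than a tested MySQL answer: it assumes the table is named db as in the question, the column types are guesses, and the CREATE/INSERT statements only rebuild the sample data.

```sql
-- Rebuild the sample data (table name db taken from the question; column types are assumed)
CREATE TABLE db (Col1 INT, Col2 VARCHAR(10));
INSERT INTO db (Col1, Col2)
VALUES (1, 'ABC'), (2, 'DEF'), (3, 'ABC'), (1, 'DEF');

-- Inner query: IDs that have both wanted values.
-- Outer query: return the matching rows for those IDs.
SELECT t.Col1, t.Col2
FROM db AS t
JOIN (
    SELECT Col1
    FROM db
    WHERE Col2 IN ('ABC', 'DEF')
    GROUP BY Col1
    HAVING COUNT(DISTINCT Col2) = 2   -- 2 = number of values being required
) AS k ON k.Col1 = t.Col1
WHERE t.Col2 IN ('ABC', 'DEF')
ORDER BY t.Col1, t.Col2;

-- Col1  Col2
-- 1     ABC
-- 1     DEF
```

COUNT(DISTINCT Col2) is used instead of COUNT(*) so that an ID with the same value repeated (for example, ABC twice) does not qualify by accident.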

Ronak Shah answered Oct 20 '22 23:10


In R, we can also use data.table

library(data.table)
setDT(df)[, .SD[all(vc %in% Col2)], by = Col1]
akrun answered Oct 21 '22 00:10