Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can we perform common set operations (union, intersection, minus) in MS Excel?

For example, I have an xls where :

  • column A has list of items with property A
  • column B has list of items with property B

I need the following :

  • column C which is A union B (unique items of both A & B)
  • column D which is A intersection B (common items of A & B)
  • column E which is A minus B (items in A but not in B)
  • column F which is B minus A (items in B but not in A)

Set operations on a list of elements seem to be easy with SQL or Python. But how to do it in xls?

Note : It should be an automation with minimal copy-pastes and clicks. For example, I dont want to copy-paste A below B, then "eliminate duplicates" to get A union B.

like image 376
Gaurav Khare Avatar asked Jul 02 '15 13:07

Gaurav Khare


People also ask

How do you do a minus query in Excel?

Subtract numbers in a cell To do simple subtraction, use the - (minus sign) arithmetic operator. For example, if you enter the formula =10-5 into a cell, the cell will display 5 as the result.

How do you do mathematical operations in Excel?

For simple formulas, simply type the equal sign followed by the numeric values that you want to calculate and the math operators that you want to use — the plus sign (+) to add, the minus sign (-) to subtract, the asterisk (*) to multiply, and the forward slash (/) to divide.


3 Answers

Intersection (In A & B): =IFNA(VLOOKUP(B2,$A$2:$B$42,1,FALSE),"")

Union (In A or B): =IFS(A2,A2,B2,B2) Note that IFS is only in Office 2019 and newer editions.

A - B (Only in A): =IF(NOT(IFNA(MATCH(A2,$B$2:$B$42,0),FALSE)),IF(A2,A2,""),"")

B - A (Only in B): =IF(NOT(IFNA(MATCH(B2,$A$2:$A$42,0),FALSE)),IF(B2,B2,""),"") (Swap the letters)

Screenshot of formulas in action, with color coding.

like image 78
Benjamin Riggs Avatar answered Oct 10 '22 18:10

Benjamin Riggs


Excel alone seems not to be able to do the job. However, there are add-ins available. You might want to test the free and open source Power Analytics for Excel. It comes with some functions exactly performing what you asked for:

Screenshot of some supported set operations of Power Analytics for Excel

Usage in Excel 365

In Excel 365 Power Analytics for Excel allows to use dynamic arrays. This feature is exclusively included Excel 365 and not available in t Excel 2019, 2016 and so on.

In the following example we simply write =PA_Sets_And(A2:A11;B2:B6) into the single cell D2 and then - like magic - the formula expands to the required length of three rows.

Usage of Power Analytics for Excel set operations in Excel 365

Usage in Excel 2019, 2016, ...

Here we use the PA_Sets_And method in order to look up whether the cell (B2) is contained in the range of the whole set (A2:A11). Not as cool as for Excel 365 but a little bit nicer than VLOOKUP :-)

Usage of Power Analytics for Excel set operations in Excel 365

like image 30
Jan Rothkegel Avatar answered Oct 10 '22 17:10

Jan Rothkegel


Well, Microsoft Excel does not handle built-in set operations. But you can emulate then by VBA using MATCH function and error handling.

Here is the code that worked for me (I presume that you have heading on the first line):

Sub set_operations()
    Dim i, j, rangeA, rangeB, rowC, rowD, rowE, rowF As Long
    Dim test1, test2 As Boolean

    rangeA = ActiveSheet.Range("A" & CStr(ActiveSheet.Rows.Count)).End(xlUp).Row()
    rangeB = ActiveSheet.Range("B" & CStr(ActiveSheet.Rows.Count)).End(xlUp).Row()
    rowC = 2
    rowD = 2
    rowE = 2
    rowF = 2
    test1 = False
    test2 = False
    test2 = False

    'A union B
    On Error GoTo errHandler1
    For i = 2 To rangeA
        If Application.Match(ActiveSheet.Cells(i, 1), ActiveSheet.Range("C:C"), 0) > 0 Then
            If test1 = True Then
                ActiveSheet.Cells(rowC, 3) = ActiveSheet.Cells(i, 1)
                rowC = rowC + 1
            End If
        End If
        test1 = False
    Next i
    For j = 2 To rangeB
        If Application.Match(ActiveSheet.Cells(j, 2), ActiveSheet.Range("C:C"), 0) > 0 Then
            If test1 = True Then
                ActiveSheet.Cells(rowC, 3) = ActiveSheet.Cells(j, 2)
                rowC = rowC + 1
            End If
        End If
        test1 = False
    Next j

    'A intersection B
    For i = 2 To rangeA
        On Error GoTo errHandler2
        If Application.Match(ActiveSheet.Cells(i, 1), ActiveSheet.Range("B:B"), 0) > 0 Then
            On Error GoTo errHandler1
            If Application.Match(ActiveSheet.Cells(i, 1), ActiveSheet.Range("D:D"), 0) > 0 Then
                If test1 = True And test2 = False Then
                    ActiveSheet.Cells(rowD, 4) = ActiveSheet.Cells(i, 1)
                    rowD = rowD + 1
                End If
            End If
        End If
        test1 = False
        test2 = False
    Next i

    'A minus B
    For i = 2 To rangeA
        On Error GoTo errHandler2
        If Application.Match(ActiveSheet.Cells(i, 1), ActiveSheet.Range("B:B"), 0) > 0 Then
            On Error GoTo errHandler1
            If Application.Match(ActiveSheet.Cells(i, 1), ActiveSheet.Range("E:E"), 0) > 0 Then
                If test1 = True And test2 = True Then
                    ActiveSheet.Cells(rowE, 5) = ActiveSheet.Cells(i, 1)
                    rowE = rowE + 1
                End If
            End If
        End If
        test1 = False
        test2 = False
    Next i

    'B minus A
    For i = 2 To rangeB
        On Error GoTo errHandler2
        If Application.Match(ActiveSheet.Cells(i, 2), ActiveSheet.Range("A:A"), 0) > 0 Then
            On Error GoTo errHandler1
            If Application.Match(ActiveSheet.Cells(i, 2), ActiveSheet.Range("F:F"), 0) > 0 Then
                If test1 = True And test2 = True Then
                    ActiveSheet.Cells(rowF, 6) = ActiveSheet.Cells(i, 2)
                    rowF = rowF + 1
                End If
            End If
        End If
        test1 = False
        test2 = False
    Next i


errHandler1:
    test1 = True
    Resume Next

errHandler2:
    test2 = True
    Resume Next
End Sub
like image 22
Fernando Macedo Avatar answered Oct 10 '22 18:10

Fernando Macedo