
Comparing two arrays and getting the values which are not common

I want a small piece of logic that compares the contents of two arrays and gets the value which is not common to both, using PowerShell.

For example, given

    $a1 = @(1,2,3,4,5)
    $b1 = @(1,2,3,4,5,6)

$c, the output, should give me the value 6, which is the one value the two arrays do not have in common.

Can someone help me out with this? Thanks!

PowerShell asked Jun 16 '11 07:06



2 Answers

    PS > $c = Compare-Object -ReferenceObject (1..5) -DifferenceObject (1..6) -PassThru
    PS > $c
    6
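The question's example only has an extra value on one side. As a hedged sketch (the arrays $a and $b below are assumed sample data, not taken from the question), leaving out -PassThru makes Compare-Object return objects whose SideIndicator property tells you which array each uncommon value came from:

```powershell
# Assumed sample data: each array has values the other lacks.
$a = 1..5
$b = 4..8

# Without -PassThru, each result carries the value in InputObject
# and its origin in SideIndicator.
$diff = Compare-Object -ReferenceObject $a -DifferenceObject $b

# '<=' marks values found only in -ReferenceObject ($a).
$onlyInA = @($diff | Where-Object SideIndicator -eq '<=' |
    ForEach-Object InputObject | Sort-Object)

# '=>' marks values found only in -DifferenceObject ($b).
$onlyInB = @($diff | Where-Object SideIndicator -eq '=>' |
    ForEach-Object InputObject | Sort-Object)

"Only in a: $onlyInA"
"Only in b: $onlyInB"
```

The simplified Where-Object and ForEach-Object syntax used here needs PowerShell 3.0 or later.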
Shay Levy answered Oct 02 '22 05:10


Collection

    $a = 1..5
    $b = 4..8

    $Yellow = $a | Where {$b -NotContains $_}

$Yellow contains all the items in $a except the ones that are in $b:

    PS C:\> $Yellow
    1
    2
    3

    $Blue = $b | Where {$a -NotContains $_}

$Blue contains all the items in $b except the ones that are in $a:

    PS C:\> $Blue
    6
    7
    8

    $Green = $a | Where {$b -Contains $_}

Not asked for in the question, but for completeness: $Green contains the items that are in both $a and $b.

    PS C:\> $Green
    4
    5

Note: Where is an alias of Where-Object. Aliases can introduce problems and make scripts harder to maintain.
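To illustrate the note about aliases, here is a minimal sketch of the same three filters with the full cmdlet name. It also swaps in the -notin and -in operators, which is my substitution rather than part of the answer above, and assumes PowerShell 3.0 or later:

```powershell
# Assumed sample data, matching the arrays used above.
$a = 1..5
$b = 4..8

# Same three filters, with Where-Object spelled out instead of the Where alias.
$Yellow = $a | Where-Object { $_ -notin $b }   # items only in $a
$Blue   = $b | Where-Object { $_ -notin $a }   # items only in $b
$Green  = $a | Where-Object { $_ -in $b }      # items in both
```

-notin reads with the single item on the left and the collection on the right, which some find easier to scan than -NotContains.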


Addendum 12 October 2019

As commented by @xtreampb and @mklement0: although the example in the question does not show it, the task that the question implies (values "not in common") is the symmetric difference between the two input sets (the union of yellow and blue).

Union

The symmetric difference between $a and $b can be defined literally as the union of $Yellow and $Blue:

    $NotGreen = $Yellow + $Blue

Written out in full:

    $NotGreen = ($a | Where {$b -NotContains $_}) + ($b | Where {$a -NotContains $_})

Performance

As you might notice, this syntax contains quite a few (redundant) loops: every item in list $a is checked (using Where) against the items in list $b (using -NotContains), and vice versa. Unfortunately, the redundancy is hard to avoid because the result of each side is hard to predict. A hash table is usually a good way to improve the performance of redundant loops. For this, I like to restate the question: get the values that appear exactly once in the concatenation of the collections ($a + $b):

    $Count = @{}
    $a + $b | ForEach-Object {$Count[$_] += 1}
    $Count.Keys | Where-Object {$Count[$_] -eq 1}

By using the ForEach statement instead of the ForEach-Object cmdlet, and the Where method instead of Where-Object, you might increase the performance by a factor of 2.5:

    $Count = @{}
    ForEach ($Item in $a + $b) {$Count[$Item] += 1}
    $Count.Keys.Where({$Count[$_] -eq 1})
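One caveat with the counting approach: it relies on each collection having no repeated values, because a value that appears twice in $a alone also ends up with a count of 2. A minimal sketch of one way around that, assuming the sample data below (not from the answer) and deduplicating each side with Select-Object -Unique before counting:

```powershell
# Assumed sample data: 1 appears twice in $a and never in $b.
$a = 1, 1, 2
$b = 2, 3

# Deduplicate each side first, so that a count of 1 really means
# "present in exactly one of the two arrays".
$Count = @{}
foreach ($Item in @($a | Select-Object -Unique) + @($b | Select-Object -Unique)) {
    $Count[$Item] += 1
}

# Keys with a count of 1 form the symmetric difference (here: 1 and 3).
$NotGreen = @($Count.Keys.Where({ $Count[$_] -eq 1 }) | Sort-Object)
```

Without the deduplication step, 1 would be counted twice and dropped from the result. The extra Select-Object calls cost some of the speed gained, so this is only worth it when duplicates are possible.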

LINQ

But Language Integrated Query (LINQ) will easily beat any native PowerShell and native .NET methods (see also "High Performance PowerShell with LINQ" and mklement0's answer to "Can the following Nested foreach loop be simplified in PowerShell?").

To use LINQ you need to explicitly define the array types:

    [Int[]]$a = 1..5
    [Int[]]$b = 4..8

And use the static methods of [Linq.Enumerable]:

    $Yellow   = [Int[]][Linq.Enumerable]::Except($a, $b)
    $Blue     = [Int[]][Linq.Enumerable]::Except($b, $a)
    $Green    = [Int[]][Linq.Enumerable]::Intersect($a, $b)
    $NotGreen = [Int[]]([Linq.Enumerable]::Except($a, $b) + [Linq.Enumerable]::Except($b, $a))
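As a related .NET option that is not part of the answer above, HashSet[T] has a SymmetricExceptWith method that computes the symmetric difference in place. A minimal sketch, assuming PowerShell 5 or later for the ::new() syntax; it is not included in the benchmark below, so no timing is claimed for it:

```powershell
# Typed arrays, as in the LINQ example above.
[int[]]$a = 1..5
[int[]]$b = 4..8

# Build a set from $a, then keep only the values that are in
# exactly one of the set and $b (modifies $Set in place, returns nothing).
$Set = [System.Collections.Generic.HashSet[int]]::new($a)
$Set.SymmetricExceptWith($b)

# A HashSet does not guarantee any order, so sort before converting to an array.
$NotGreen = [int[]]($Set | Sort-Object)
```

Because a set holds each value once, duplicates in either input are collapsed as a side effect.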

Benchmark

Benchmark results depend heavily on the sizes of the collections and on how many items are actually shared. As an "average" case, I am presuming that half of each collection is shared with the other.

    Using               Time (ms)
    Compare-Object      111,9712
    NotContains         197,3792
    ForEach-Object      82,8324
    ForEach Statement   36,5721
    LINQ                22,7091

To get a good performance comparison, caches should be cleared by e.g. starting a fresh PowerShell session.

    $a = 1..1000
    $b = 500..1500

    (Measure-Command {
        Compare-Object -ReferenceObject $a -DifferenceObject $b -PassThru
    }).TotalMilliseconds

    (Measure-Command {
        ($a | Where {$b -NotContains $_}), ($b | Where {$a -NotContains $_})
    }).TotalMilliseconds

    (Measure-Command {
        $Count = @{}
        $a + $b | ForEach-Object {$Count[$_] += 1}
        $Count.Keys | Where-Object {$Count[$_] -eq 1}
    }).TotalMilliseconds

    (Measure-Command {
        $Count = @{}
        ForEach ($Item in $a + $b) {$Count[$Item] += 1}
        $Count.Keys.Where({$Count[$_] -eq 1})
    }).TotalMilliseconds

    [Int[]]$a = $a
    [Int[]]$b = $b
    (Measure-Command {
        [Int[]]([Linq.Enumerable]::Except($a, $b) + [Linq.Enumerable]::Except($b, $a))
    }).TotalMilliseconds
iRon answered Oct 02 '22 05:10