
Intersection of two lists, keeping duplicates in the first list

I have two flat lists where one of them contains duplicate values. For example,

array1 = [1,4,4,7,10,10,10,15,16,17,18,20]
array2 = [4,6,7,8,9,10]

I need to find the values in array1 that are also in array2, KEEPING THE DUPLICATES in array1. The desired outcome is

result = [4,4,7,10,10,10]

I want to avoid loops, as the actual arrays will contain millions of values. I have tried various set and intersect combinations, but couldn't keep the duplicates.

user32147 asked Oct 30 '14 21:10


2 Answers

What do you mean you don't want to use loops? You're going to have to iterate over it one way or another. Just take in each item individually and check if it's in array2 as you go:

items = set(array2)
found = [i for i in array1 if i in items]

Furthermore, depending on how you are going to use the result, consider having a generator:

found = (i for i in array1 if i in items)

so that you won't have to have the whole thing in memory all at once.
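As a rough sketch of what consuming such a generator lazily could look like (the use of `itertools.islice` and the names `first_three` and `remaining` are my own illustration, not something the question requires):

```python
from itertools import islice

array1 = [1, 4, 4, 7, 10, 10, 10, 15, 16, 17, 18, 20]
array2 = [4, 6, 7, 8, 9, 10]

items = set(array2)  # build the set once, outside the generator
found = (i for i in array1 if i in items)  # nothing is computed yet

# Pull only the first three matches; the generator pauses after yielding them.
first_three = list(islice(found, 3))
print(first_three)  # [4, 4, 7]

# The generator resumes where it left off and can be consumed only once.
remaining = list(found)
print(remaining)  # [10, 10, 10]
```

Keep in mind that a generator can only be iterated once, so if you need the result several times, the list version is probably the better fit.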

anon582847382 answered Sep 27 '22 18:09


The following will do it:

array1 = [1,4,4,7,10,10,10,15,16,17,18,20]
array2 = [4,6,7,8,9,10]
set2 = set(array2)
print([el for el in array1 if el in set2])

It keeps the order and repetitions of elements in array1.

It turns array2 into a set for faster lookups. Note that this is only beneficial if array2 is sufficiently large; if array2 is small, it may be more performant to keep it as a list.
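If you want to check where the break-even point lies for your own data, something along these lines with `timeit` may help. This is only a sketch: the helper names `with_list` and `with_set` and the repetition count are arbitrary choices of mine, and the timings will depend on your machine and on the real size of array2.

```python
import timeit

array1 = [1, 4, 4, 7, 10, 10, 10, 15, 16, 17, 18, 20]
array2 = [4, 6, 7, 8, 9, 10]
set2 = set(array2)

def with_list():
    # Membership test scans array2 linearly for every element of array1.
    return [el for el in array1 if el in array2]

def with_set():
    # Membership test is a hash lookup in set2.
    return [el for el in array1 if el in set2]

# Both variants must produce the same result.
assert with_list() == with_set() == [4, 4, 7, 10, 10, 10]

# Time each variant; absolute numbers vary from machine to machine.
for name, fn in (("list", with_list), ("set", with_set)):
    seconds = timeit.timeit(fn, number=100_000)
    print(f"{name} lookup: {seconds:.3f} s")
```

Swap in arrays of realistic size before drawing any conclusions, since the toy lists here are far too small to be representative.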

NPE answered Sep 27 '22 18:09