Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Remove shared references in list-of-list?

Ok, let me explain the problem with a simple example:

l = [[0]]*3       # makes the array [[0], [0], [0]]
l[0][0] = 42      # l becomes [[42], [42], [42]]
from copy import deepcopy
m = deepcopy(l)   # m becomes [[42], [42], [42]]
m[0][0] = 2       # m becomes [[2], [2], [2]]

This is a basic shared reference problem. Except usually, when a problem like this occurs, deepcopy is our friend. Currently, I made this to solve my deepcopy betrayal problem:

l = [[0]]*3       # makes the array [[0], [0], [0]]
import JSON
m = JSON.loads(JSON.dumps(l)) # m becomes [[0], [0], [0]] with no self references

I am looking for a less inefficient and less stupid way of handling self shared reference.

Of course I wouldn't make arrays like that on purpose, but I need to handle the case where someone gives one to my code. Running my "solution" on large arrays is slow, and I have many levels of nested arrays, I can't afford to make a string this big for those beasts.

like image 596
Benoît P Avatar asked Jan 31 '19 18:01

Benoît P


People also ask

How do you remove an item from a list in Python?

How to Remove an Element from a List Using the remove() Method in Python. To remove an element from a list using the remove() method, specify the value of that element and pass it as an argument to the method. remove() will search the list to find it and remove it.

How do you remove an element from a list from another list in Python?

Use list. remove() to remove the elements of a list from another list. Use a for-loop to iterate through each element in list_1 and, if that element is in list_2 , call list. remove(element) with element as the current element to remove element from list_2 .

Why is copy () not working in Python?

Python List Copy Not Working. The main reason why the list. copy() method may not work for you is because you assume that it creates a deep copy when, in reality, it only creates a shallow copy of the list.


4 Answers

Here's an approach that will work on any combination of lists, dicts, and immutable values.

def very_deep_copy(obj):
    if isinstance(obj, list):
        return [very_deep_copy(item) for item in obj]
    elif isinstance(obj, dict):
        return {k: very_deep_copy(v) for k,v in obj.items()}
    else:
        return obj

l = [[0]]*3 
m = very_deep_copy(l)
m[0][0] = 2
print(m)

Result:

[[2], [0], [0]]
like image 174
Kevin Avatar answered Oct 23 '22 16:10

Kevin


I'm going to challenge the assumption that the right thing to do is to copy the shared objects. You say that

Of course I wouldn't make arrays like that on purpose, but I need to handle the case where someone gives one to my code.

but if someone passes you an input with unexpected object sharing, their code has a bug. If your code notices the bug, your code should tell them about it by throwing an exception, to help them fix their bugs.

Most code would just assume the input doesn't have any unwanted object sharing. If you want to detect it anyway, a manual traversal is probably your best option, especially since your input is expected to be JSON-serializable:

def detect_duplicate_references(data):
    _detect_duplicate_references(data, set())

def _detect_duplicate_references(data, memo):
    if isinstance(data, (dict, list)):
        if id(data) in memo:
            raise ValueError(f'multiple references to object {data} detected in input')
        memo.add(id(data))
    if isinstance(data, list):
        for obj in data:
            _detect_duplicate_references(obj, memo)
    if isinstance(data, dict):
        for obj in data.values():
            _detect_duplicate_references(obj, memo)
like image 44
user2357112 supports Monica Avatar answered Oct 23 '22 16:10

user2357112 supports Monica


l = [[0] for _ in range(3)]
l[0][0] = 2
print(l)

prints:

[[2], [0], [0]]

also, btw in your code deepcopy() gave the output it did because you passed in a list which already had elements that share the same reference.

from copy import deepcopy
m = deepcopy(l)
m[0][0] = 3
print(l) # prints [[2], [0], [0]]
print(m) # prints [[3], [0], [0]]

you can see here it does what we expect it to no problem

like image 2
sam46 Avatar answered Oct 23 '22 16:10

sam46


Assuming only structure type will be list, this should work for list of arbitrary depth and complexity:

def customcopy(l):
    return [customcopy(e) if type(e) is list else e for e in l]

l = [[0]]*3
x = customcopy(l)
x[0][0] = 3

>>> x
[[3], [0], [0]]
like image 1
Rocky Li Avatar answered Oct 23 '22 15:10

Rocky Li