
Python basic data references, list of same reference

Say I have two lists:

>>> l1=[1,2,3,4]
>>> l2=[11,12,13,14]

I can put those lists in a tuple, or dictionary, and it appears that they are all references back to the original list:

>>> t=(l1,l2)
>>> d={'l1':l1, 'l2':l2}
>>> id(l1)==id(d['l1'])==id(t[0])
True
>>> l1 is d['l1'] is t[0]
True

Since they are references, I can change l1 and the referred data in the tuple and dictionary change accordingly:

>>> l1.append(5)
>>> l1
[1, 2, 3, 4, 5]
>>> t
([1, 2, 3, 4, 5], [11, 12, 13, 14])
>>> d
{'l2': [11, 12, 13, 14], 'l1': [1, 2, 3, 4, 5]}

The same holds if I append through the reference in the dictionary d or through the reference to the mutable list in the tuple t:

>>> d['l1'].append(6)
>>> t[0].append(7)
>>> d
{'l2': [11, 12, 13, 14], 'l1': [1, 2, 3, 4, 5, 6, 7]}
>>> l1
[1, 2, 3, 4, 5, 6, 7]

If I now set l1 to a new list, the reference count for the original list decreases:

>>> import sys
>>> sys.getrefcount(l1)
4
>>> sys.getrefcount(t[0])
4
>>> l1=['new','list']
>>> l1 is d['l1'] is t[0]
False
>>> sys.getrefcount(l1)
2
>>> sys.getrefcount(t[0])
3

And appending to or changing l1 does not change d['l1'] or t[0], since l1 is now a reference to a different list. The notion of indirect references is covered fairly well in the Python documentation, but not completely.

My questions:

  1. Is a mutable object always a reference? Can you always assume that modifying it modifies the original (Unless you specifically make a copy with l2=l1[:] kind of idiom)?

  2. Can I assemble a list of all the same references in Python? i.e., some function f(l1) that returns ['l1', 'd', 't'] if all of those are referring to the same list?

  3. It is my assumption that no matter what, the data will remain valid so long as there is some reference to it.

ie:

l=[1,2,3]         # create an object of three integers and create a ref to it
l2=l              # create a reference to the same object
l=[4,5,6]         # create a new object of 3 ints; the original now referenced 
                  # by l2 is unchanged and unmoved
dawg asked Dec 29 '10 19:12


2 Answers

1) Modifying a mutable object through a reference will always modify the "original". Honestly, this is betraying a misunderstanding of references. The newer reference is just as much the "original" as is any other reference. So long as both names point to the same object, modifying the object through either name will be reflected when accessed through the other name.
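A minimal sketch of the difference between a second name, a shallow copy, and a deep copy. The variable names here are only illustrative and are not from the question:

```python
import copy

original = [1, 2, [3, 4]]
alias = original                # second name for the same list object
shallow = original[:]           # new outer list, same inner objects
deep = copy.deepcopy(original)  # new outer list and new inner list

alias.append(5)                 # mutation is visible through both names
original[2].append(99)          # the inner list is shared with the shallow copy

print(alias is original)        # True
print(original)                 # [1, 2, [3, 4, 99], 5]
print(shallow)                  # [1, 2, [3, 4, 99]]
print(deep)                     # [1, 2, [3, 4]]
```

Note that the l2=l1[:] idiom only copies the outer list; nested mutable objects are still shared unless you use copy.deepcopy.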

2) Not exactly what you want. gc.get_referrers returns the objects that directly refer to the given object (the containers themselves, not their names).

>>> l = [1, 2]
>>> d = {0: l}
>>> t = (l, )
>>> import gc
>>> import pprint
>>> pprint.pprint(gc.get_referrers(l))
[{'__builtins__': <module '__builtin__' (built-in)>,
  '__doc__': None,
  '__name__': '__main__',
  '__package__': None,
  'd': {0: [1, 2]},
  'gc': <module 'gc' (built-in)>,
  'l': [1, 2],
  'pprint': <module 'pprint' from '/usr/lib/python2.6/pprint.pyc'>,
  't': ([1, 2],)},   # This is globals()

 {0: [1, 2]},  # This is d
 ([1, 2],)]   # this is t

Note that the actual object referenced by l is not included in the returned list because it does not contain a reference to itself. globals() is returned because that does contain a reference to the original list.
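If you really want names rather than referrer objects, one rough approach is to scan a namespace yourself with identity checks. The helper names_bound_to below is hypothetical (not a standard library function), and it only looks one level deep into lists, tuples, and dict values:

```python
def names_bound_to(obj, namespace):
    """Return the sorted names in `namespace` that refer to `obj` directly,
    or hold it one level deep inside a list, tuple, or dict value."""
    found = []
    for name, value in namespace.items():
        if value is obj:
            found.append(name)
        elif isinstance(value, (list, tuple)):
            if any(item is obj for item in value):
                found.append(name)
        elif isinstance(value, dict):
            if any(item is obj for item in value.values()):
                found.append(name)
    return sorted(found)

l1 = [1, 2, 3, 4]
l2 = [11, 12, 13, 14]
# An explicit namespace keeps the example self-contained.
namespace = {'l1': l1, 'l2': l2, 't': (l1, l2), 'd': {'l1': l1, 'l2': l2}}

print(names_bound_to(l1, namespace))   # ['d', 'l1', 't']
```

Passing globals() instead of the explicit dict would scan the module-level names. The check uses is rather than ==, so an equal but distinct list is not reported.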

3) If by valid, you mean "will not be garbage collected" then this is correct barring a highly unlikely bug. It would be a pretty sorry garbage collector that "stole" your data.
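One way to watch this happen is with a weak reference, which observes an object without keeping it alive. This is only a sketch: the Data subclass is an assumption needed because built-in lists cannot be weakly referenced, and the immediate cleanup shown is CPython behaviour (reference counting), not a language guarantee:

```python
import gc
import weakref

class Data(list):
    """Plain list subclass; built-in lists cannot be weakly referenced."""

first = Data([1, 2, 3])
second = first                  # second strong reference to the same object
watcher = weakref.ref(first)    # does not keep the object alive

first = Data([4, 5, 6])         # rebinding; the original is still held by `second`
alive_while_referenced = watcher() is second

del second                      # drop the last strong reference
gc.collect()                    # not needed on CPython, harmless elsewhere
alive_after_release = watcher() is not None

print(alive_while_referenced)   # True
print(alive_after_release)      # False on CPython
```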

aaronasterling answered Oct 19 '22 07:10


Every variable in Python is a reference.

For lists, you are focusing on the results of the append() method and losing sight of the bigger picture of Python data structures. There are other methods on lists, and there are advantages and consequences to how a list is constructed. It is helpful to think of a list as a view onto the other objects it refers to. Lists do not "contain" anything other than the rules and ways of accessing the objects they refer to.

The l.append(x) method specifically is equivalent to l[len(l):]=[x]

So:

>>> l1=list(range(3))        # list() is needed on Python 3, where range is lazy
>>> l2=list(range(20,23))
>>> l3=list(range(30,33))
>>> l1[len(l1):]=[l2]    # equivalent to 'append' for subscriptable sequences 
>>> l1[len(l1):]=l3      # same as 'extend'
>>> l1
[0, 1, 2, [20, 21, 22], 30, 31, 32]
>>> len(l1)
7
>>> l1.index(30)
4
>>> l1.index(20)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: list.index(x): x not in list
>>> 20 in l1
False
>>> 30 in l1
True

By wrapping l2 in a list display, as in l1[len(l1):]=[l2], or by calling l1.append(l2), you add a single element to l1: a reference to the same list object that l2 names. If you mutate that list through l2, the change is visible through l1 as well. It counts as one element of l1 no matter how long the appended list is.

Without the brackets, as in l1[len(l1):]=l3, references to each element of l3 are added to l1 individually.

Other common list operations, such as l.index(something) or in, do not look inside nested lists, and l.sort() may not give a useful ordering when a list mixes numbers and lists (Python 3 raises a TypeError). They are "shallow" operations on the list, and by using l1[len(l1):]=[l2] you have created a nested data structure.

If you use l1[len(l1):]=l3, you are making a true (shallow) copy of the elements in l3.
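The same contrast written with the append and extend methods, as a small sketch that also shows what happens when the source lists change afterwards:

```python
l1 = [0, 1, 2]
l2 = [20, 21, 22]
l3 = [30, 31, 32]

l1.append(l2)    # one new element: a reference to the list l2 names
l1.extend(l3)    # three new elements: references to l3's items

l2.append(23)    # visible through l1[3], which is the same object
l3.append(33)    # not visible in l1, which never referred to l3 itself

print(l1)            # [0, 1, 2, [20, 21, 22, 23], 30, 31, 32]
print(l1[3] is l2)   # True
print(len(l1))       # 7
```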

These are fairly fundamental Python idioms, and most of the time they 'do the right thing.' You can, however, get surprising results, such as:

>>> m=[[None]*2]*3
>>> m
[[None, None], [None, None], [None, None]]
>>> m[0][1]=33
>>> m
[[None, 33], [None, 33], [None, 33]]   # probably not what was intended...
>>> m[0] is m[1] is m[2]               # same object, that's why they all changed
True

Some Python newcomers try to create a multidimensional list with something like m=[[None]*2]*3. The first sequence replication works as expected: it creates a list holding 2 references to None. The second is the problem: it creates a list holding three references to that same inner list. So m[0][1]=33 modifies the one inner list, and every element of m shows the change.

Compare to:

>>> m=[[None]*2,[None]*2,[None]*2]
>>> m
[[None, None], [None, None], [None, None]]
>>> m[0][1]=33
>>> m
[[None, 33], [None, None], [None, None]]

You can also use nested list comprehensions to do the same like so:

>>> m=[[ None for i  in range(2)] for j in range(3)]
>>> m
[[None, None], [None, None], [None, None]]
>>> m[0][1]=44
>>> m
[[None, 44], [None, None], [None, None]]
>>> m[0] is m[1] is m[2]                      # three different lists....
False
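If you build grids like this often, the comprehension can be wrapped in a small function. Both make_grid and rows_are_distinct are hypothetical helper names used only for this sketch:

```python
def make_grid(rows, cols, fill=None):
    """Build a rows x cols list of lists in which every row is a distinct list."""
    return [[fill] * cols for _ in range(rows)]

def rows_are_distinct(grid):
    """Return True when no two rows are the same object."""
    return len({id(row) for row in grid}) == len(grid)

shared = [[None] * 2] * 3    # three references to one inner list
grid = make_grid(3, 2)       # three separate inner lists
grid[0][1] = 33

print(grid)                        # [[None, 33], [None, None], [None, None]]
print(rows_are_distinct(shared))   # False
print(rows_are_distinct(grid))     # True
```

One caveat: if fill is itself mutable, every cell still refers to that one fill object, so this is only safe as written for immutable fill values such as None or numbers.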

For lists and references, Fredrik Lundh has this text for a good intro.

As to your specific questions:

1) In Python, every name is a label for, or a reference to, an object. There is no 'original' (a C++ concept), and there is no distinction between a 'reference', a pointer, and the actual data (a C / Perl concept).
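A short sketch of why mutability is what matters, not whether something "is a reference". Names bound to immutable objects are references too; the only difference is that an immutable object cannot be changed in place, so an operation like += has to rebind the name instead:

```python
n = 10
m = n          # two names, one int object
m += 1         # ints are immutable: += builds a new object and rebinds m

a = [1, 2]
b = a          # two names, one list object
b += [3]       # lists are mutable: += extends the list in place

print(n, m)      # 10 11
print(a, b)      # [1, 2, 3] [1, 2, 3]
print(a is b)    # True
```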

2) Fredrik Lundh has a great analogy, given in answer to a similar question:

The same way as you get the name of that cat you found on your porch: the cat (object) itself cannot tell you its name, and it doesn't really care -- so the only way to find out what it's called is to ask all your neighbours (namespaces) if it's their cat (object)...

....and don't be surprised if you'll find that it's known by many names, or no name at all!

You can find this list with some effort, but why? Just call it what you are going to call it -- like a found cat.

3) True.

the wolf answered Oct 19 '22 06:10