Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

numpy array of objects

I'm trying to implement a simulation for a lattice model (lattice boltzmann) in Python. Each site of the lattice has a number of properties, and interact with neighboring sites according to certain rules. I figured that it might be clever to make a class with all the properties and make a grid of instances of that class. (As I'm inexperienced with Python, this might not be a good idea at all, so feel free to comment on my approach.)

Here is a toy example of what I'm doing

class site:     def __init__(self,a,...):         self.a = a         .... other properties ...     def set_a(self, new_a):         self.a = new_a 

Now I want to deal with a 2D/3D lattice (grid) of such sites so I tried to do the following (here is a 2D 3x3 grid as an example, but in simulation I would need the order of >1000x1000X1000)

lattice = np.empty( (3,3), dtype=object) lattice[:,:] = site(3) 

Now, the problem is that each lattice point refer to the same instance, for example

lattice[0,0].set_a(5) 

will also set the value of lattice[0,2].a to 5. This behavior is unwanted. To avoid the problem i can loop over each grid point and assign the objects element by element, like

for i in range(3):     for j in range(3):         lattice[i,j] = site(a) 

But is there a better way (not involving the loops) to assign objects to a multidimensional array?

Thanks

like image 851
jonalm Avatar asked Feb 02 '11 17:02

jonalm


People also ask

What is the an array of objects?

The array of objects represent storing multiple objects in a single name. In an array of objects, the data can be accessed randomly by using the index number. Reduce the time and memory by storing the data in a single variable.

What is array object in NumPy called?

The most important object defined in NumPy is an N-dimensional array type called ndarray.

Are NumPy arrays faster than lists?

Because the Numpy array is densely packed in memory due to its homogeneous type, it also frees the memory faster. So overall a task executed in Numpy is around 5 to 100 times faster than the standard python list, which is a significant leap in terms of speed.

What is array of objects give an example?

Each variable or object in an array is called an element. Unlike stricter languages, such as Java, you can store a mixture of data types in a single array. For example, you could have array with the following four elements: an integer, a window object, a string and a button object.


2 Answers

You can vectorize the class's __init__ function:

import numpy as np  class Site:     def __init__(self, a):         self.a = a     def set_a(self, new_a):         self.a = new_a  vSite = np.vectorize(Site)  init_arry = np.arange(9).reshape((3,3))  lattice = np.empty((3,3), dtype=object) lattice[:,:] = vSite(init_arry) 

This may look cleaner but has no performance advantage over your looping solution. The list comprehension answers create an intermediate python list which would cause a performance hit.

like image 67
Paul Avatar answered Sep 26 '22 02:09

Paul


The missing piece for you is that Python treats everything as a reference. (There are some "immutable" objects, strings and numbers and tuples, that are treated more like values.) When you do

lattice[:,:] = site(3) 

you are saying "Python: make a new object site, and tell every element of lattice to point to that object." To see that this is really the case, print the array to see that the memory addresses of the objects are all the same:

array([[<__main__.Site object at 0x1029d5610>,         <__main__.Site object at 0x1029d5610>,         <__main__.Site object at 0x1029d5610>],        [<__main__.Site object at 0x1029d5610>,         <__main__.Site object at 0x1029d5610>,         <__main__.Site object at 0x1029d5610>],        [<__main__.Site object at 0x1029d5610>,         <__main__.Site object at 0x1029d5610>,         <__main__.Site object at 0x1029d5610>]], dtype=object) 

The loop way is one correct way to do it. With numpy arrays, that may be your best option; with Python lists, you could also use a list comprehension:

lattice = [ [Site(i + j) for i in range(3)] for j in range(3) ] 

You can use a list comprehension with the numpy.array construction:

lattice = np.array( [ [Site(i + j) for i in range(3)] for j in range(3) ],                     dtype=object) 

Now when you print lattice, it's

array([[<__main__.Site object at 0x1029d53d0>,         <__main__.Site object at 0x1029d50d0>,         <__main__.Site object at 0x1029d5390>],        [<__main__.Site object at 0x1029d5750>,         <__main__.Site object at 0x1029d57d0>,         <__main__.Site object at 0x1029d5990>],        [<__main__.Site object at 0x1029d59d0>,         <__main__.Site object at 0x1029d5a10>,         <__main__.Site object at 0x1029d5a50>]], dtype=object) 

so you can see that every object in there is unique.

You should also note that "setter" and "getter" methods (e.g., set_a) are un-Pythonic. It's better to set and get attributes directly, and then use the @property decorator if you REALLY need to prevent write access to an attribute.

Also note that it's standard for Python classes to be written using CamelCase, not lowercase.

like image 21
Seth Johnson Avatar answered Sep 25 '22 02:09

Seth Johnson