
Is there a standard way to store XY data in Python?

Is there a standard way to store (x,y), (x,y,z), or (x,y,z,t) data in Python?

I know NumPy arrays are often used for things like this, but I suppose you could also do it with NumPy matrices.

I've seen two lists zipped together, which sidesteps NumPy altogether.

XY_data = list(zip(range(0, 10), range(0, 10)))  # list() is needed on Python 3, where zip is lazy

Is there a standard? If not, what is your favorite way, or the one which you have seen the most?
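
The zipped-list approach mentioned above can be sketched on Python 3 roughly as follows. This is only a minimal illustration; the names `xs`, `ys`, and `pairs` are made up for the example and the values are arbitrary.

```python
# Pair up two coordinate lists without NumPy (names are illustrative).
xs = list(range(10))
ys = [v * v for v in xs]

# zip is lazy on Python 3, so materialize it to index or reuse the pairs.
pairs = list(zip(xs, ys))

# "Unzip" back into separate coordinate tuples with the * operator.
xs_again, ys_again = zip(*pairs)
```

This keeps each (x, y) pair together as a tuple, at the cost of losing vectorized arithmetic on the columns.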

chase asked Apr 04 '13 15:04




1 Answer

One nice way is with a structured array. This gives you all the advantages of NumPy arrays and adds convenient access by field name.

All you need to do to make your NumPy array a "structured" one is to pass a dtype argument that gives each "field" a name and a type. Fields can even have more complex shapes and hierarchies if you wish, but here's how I keep my x-y data:

In [175]: import numpy as np

In [176]: x = np.random.random(10)

In [177]: y = np.random.random(10)

In [179]: list(zip(x,y))
Out[179]:
[(0.27432965895978034, 0.034808254176554643),
 (0.10231729328413885, 0.3311112896885462),
 (0.87724361175443311, 0.47852682944121905),
 (0.24291769332378499, 0.50691735432715967),
 (0.47583427680221879, 0.04048957803763753),
 (0.70710641602121627, 0.27331443495117813),
 (0.85878694702522784, 0.61993945461613498),
 (0.28840423235739054, 0.11954319357707233),
 (0.22084849730366296, 0.39880927226467255),
 (0.42915612628398903, 0.19197320645915561)]

In [180]: data = np.array(list(zip(x, y)), dtype=[('x', float), ('y', float)])

In [181]: data['x']
Out[181]:
array([ 0.27432966,  0.10231729,  0.87724361,  0.24291769,  0.47583428,
        0.70710642,  0.85878695,  0.28840423,  0.2208485 ,  0.42915613])

In [182]: data['y']
Out[182]:
array([ 0.03480825,  0.33111129,  0.47852683,  0.50691735,  0.04048958,
        0.27331443,  0.61993945,  0.11954319,  0.39880927,  0.19197321])

In [183]: data[0]
Out[183]: (0.27432965895978034, 0.03480825417655464)
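
If you would rather not go through zip at all, one alternative is to allocate the structured array first and fill it field by field. The following is only a sketch under that assumption; the seed and the name `by_x` are arbitrary choices for the example, not part of the session above.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
x = rng.random(10)
y = rng.random(10)

# Allocate the structured array first, then fill each named field.
data = np.empty(len(x), dtype=[('x', float), ('y', float)])
data['x'] = x
data['y'] = y

# Records stay paired when sorting by a field.
by_x = np.sort(data, order='x')
```

Sorting with `order='x'` reorders whole records, so each y value stays attached to its x.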

Others will probably suggest using Pandas, but if your data is relatively simple, plain NumPy might be easier.

You can add hierarchy if you wish, but often it's more complicated than necessary.

For example:

In [200]: t = np.arange(10)

In [202]: dt = np.dtype([('t',int),('pos',[('x',float),('y',float)])])

In [203]: alldata = np.array(list(zip(t, zip(x, y))), dtype=dt)

In [204]: alldata
Out[204]:
array([(0, (0.27432965895978034, 0.03480825417655464)),
       (1, (0.10231729328413885, 0.3311112896885462)),
       (2, (0.8772436117544331, 0.47852682944121905)),
       (3, (0.242917693323785, 0.5069173543271597)),
       (4, (0.4758342768022188, 0.04048957803763753)),
       (5, (0.7071064160212163, 0.27331443495117813)),
       (6, (0.8587869470252278, 0.619939454616135)),
       (7, (0.28840423235739054, 0.11954319357707233)),
       (8, (0.22084849730366296, 0.39880927226467255)),
       (9, (0.429156126283989, 0.1919732064591556))],
      dtype=[('t', '<i8'), ('pos', [('x', '<f8'), ('y', '<f8')])])

In [205]: alldata['t']
Out[205]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [206]: alldata['pos']
Out[206]:
array([(0.27432965895978034, 0.03480825417655464),
       (0.10231729328413885, 0.3311112896885462),
       (0.8772436117544331, 0.47852682944121905),
       (0.242917693323785, 0.5069173543271597),
       (0.4758342768022188, 0.04048957803763753),
       (0.7071064160212163, 0.27331443495117813),
       (0.8587869470252278, 0.619939454616135),
       (0.28840423235739054, 0.11954319357707233),
       (0.22084849730366296, 0.39880927226467255),
       (0.429156126283989, 0.1919732064591556)],
      dtype=[('x', '<f8'), ('y', '<f8')])

In [207]: alldata['pos']['x']
Out[207]:
array([ 0.27432966,  0.10231729,  0.87724361,  0.24291769,  0.47583428,
        0.70710642,  0.85878695,  0.28840423,  0.2208485 ,  0.42915613])
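
The same nesting idea should extend to the (x,y,z,t) case from the question. Here is a rough sketch, assuming a float time stamp and a three-component position; the field names, the array name `track`, and the sample values are all made up for illustration.

```python
import numpy as np

n = 5
# Nested dtype: a time stamp plus a 3-D position (field names are illustrative).
dt = np.dtype([('t', float), ('pos', [('x', float), ('y', float), ('z', float)])])

track = np.zeros(n, dtype=dt)
track['t'] = np.linspace(0.0, 1.0, n)
track['pos']['x'] = np.arange(n)        # field access is a view, so this writes into track
track['pos']['y'] = 2.0 * np.arange(n)
track['pos']['z'] = -1.0

# Stack the position fields into an ordinary (n, 3) float array when needed.
xyz = np.column_stack([track['pos'][name] for name in ('x', 'y', 'z')])
```

The final `column_stack` step is handy when another library expects a plain 2-D float array rather than named fields.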

askewchan answered Sep 22 '22 12:09