Matrix operations with rows of pandas dataframes

I have a pandas dataframe with three columns holding the x, y and z coordinates of object positions. I also have a transformation matrix that rotates those points by a certain angle. I previously looped through each row of the dataframe to apply the transformation, but that turned out to be very slow. Now I want to transform all rows at once and append the results as additional columns.

I'm looking for a working version of this line (which always returns a shape mismatch):

largest_haloes['X_rot', 'Y_rot', 'Z_rot'] = np.dot(rot,np.array([largest_haloes['X'], largest_haloes['Y'], largest_haloes['Z']]).T)

Here's a minimal example that reproduces the problem:

from __future__ import division
import math
import pandas as pd
import numpy as np

def unit_vector(vector):
    return vector / np.linalg.norm(vector)


largest_haloes = pd.DataFrame()
largest_haloes['X'] = np.random.uniform(1,10,size=30)
largest_haloes['Y'] = np.random.uniform(1,10,size=30)
largest_haloes['Z'] = np.random.uniform(1,10,size=30)

normal = np.array([np.random.uniform(-1,1),np.random.uniform(-1,1),np.random.uniform(0,1)])
normal = unit_vector(normal)

a = normal[0]
b = normal[1]
c = normal[2]

d = math.sqrt(a**2 + b**2)
rot = np.array([[b/d, -a/d, 0],
                [a*c/d, b*c/d, -d],
                [a, b, c]])

largest_haloes['X_rot', 'Y_rot', 'Z_rot'] = np.dot(rot,np.array([largest_haloes['X'], largest_haloes['Y'], largest_haloes['Z']]).T)

So the goal is that each row of largest_haloes['X_rot', 'Y_rot', 'Z_rot'] should be populated with a rotated version of the corresponding row of largest_haloes['X','Y','Z']. How can I do this without looping through rows? I've also tried df.dot but there is not much documentation on it and it didn't seem to do what I wanted.

Arnold asked Nov 08 '22 07:11


1 Answer

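If you mean applying the rotation as a matrix multiplication, you can convert the dataframe to a numpy array and transform every row in one call. Each row holds one point, so multiply by rot.T on the right; that gives the same result as np.dot(rot, row) for every row:

# one point per row, shape (N, 3)
lh = largest_haloes[['X', 'Y', 'Z']].values
# row-wise equivalent of np.dot(rot, point)
rotated_array = lh.dot(rot.T)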
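The orientation of the product matters here: with one point per row, rotating each point by rot corresponds to multiplying by the transpose of rot on the right, since (R v)^T = v^T R^T. The following is a minimal sketch that checks the vectorized product against an explicit per-row loop. The rotation (90 degrees about the z-axis) and the sample points are made up for illustration and are not the question's data.

```python
import numpy as np

# Hypothetical rotation: 90 degrees about the z-axis (not the question's matrix).
rot = np.array([[0.0, -1.0, 0.0],
                [1.0,  0.0, 0.0],
                [0.0,  0.0, 1.0]])

# One point per row, shape (N, 3); values are made up for illustration.
points = np.array([[1.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0],
                   [1.0, 1.0, 3.0]])

# Row-by-row reference: rotate each point as a column vector.
looped = np.array([rot.dot(p) for p in points])

# Vectorized: rot.dot(p) for every row p equals points.dot(rot.T).
vectorized = points.dot(rot.T)

# Multiplying by rot itself on the right applies the inverse rotation instead.
inverse = points.dot(rot)

print(np.allclose(looped, vectorized))
```

For a proper rotation matrix the two products differ only in direction of rotation, so the mistake is easy to miss unless the result is compared against the loop on a few rows.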

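You can also stay in pandas and use DataFrame.dot, which aligns the dataframe's columns with the index of the other operand:

# rows of x are labelled by the input columns, columns by the output names
x = pd.DataFrame(data=rot.T, index=['X', 'Y', 'Z'], columns=['X_rot', 'Y_rot', 'Z_rot'])
rotated_df = largest_haloes[['X', 'Y', 'Z']].dot(x)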
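To append the rotated coordinates as additional columns, as the question asks, one option is to wrap the rotated array in a DataFrame that shares the original index and join it back. This is only a sketch: it assumes the X, Y, Z column names from the question, while the sample coordinates and the rotation (90 degrees about the z-axis) are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up coordinates; the question uses random values instead.
largest_haloes = pd.DataFrame({
    'X': [1.0, 0.0, 1.0],
    'Y': [0.0, 2.0, 1.0],
    'Z': [0.0, 0.0, 3.0],
})

# Hypothetical rotation: 90 degrees about the z-axis.
rot = np.array([[0.0, -1.0, 0.0],
                [1.0,  0.0, 0.0],
                [0.0,  0.0, 1.0]])

# Select the coordinate columns explicitly so extra columns do not interfere.
coords = largest_haloes[['X', 'Y', 'Z']].to_numpy()  # shape (N, 3)

# Rotate all rows at once and label the result with the new column names.
rotated = pd.DataFrame(coords.dot(rot.T),
                       index=largest_haloes.index,
                       columns=['X_rot', 'Y_rot', 'Z_rot'])

# Append the rotated coordinates as additional columns.
largest_haloes = largest_haloes.join(rotated)
print(largest_haloes)
```

Reusing the original index in the intermediate DataFrame keeps the join aligned row for row, even if the dataframe has been filtered or re-sorted beforehand.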

WannaBeCoder answered Nov 15 '22 07:11