I am trying to apply an RMS function to accelerometer data, which has 3 dimensions. I also have a timestamp column at the beginning, which I keep as a day count. The dataframe looks like this:
0 1 2 3
0 1.963 -12.0 -71.0 -2.0
1 1.963 -11.0 -71.0 -3.0
2 1.963 -14.0 -67.0 -6.0
3 1.963 -16.0 -63.0 -7.0
4 1.963 -18.0 -60.0 -8.0
Column '0' is Days, and the other columns are the accelerometer's 3-axis data. Right now I use this approach to compute the RMS value into a new column and drop the existing 3-axis data:
import numpy as np
import pandas as pd

def rms_detrend(x):
    return np.sqrt(np.mean(x[1]**2 + x[2]**2 + x[3]**2))

accdf = pd.read_csv(ACC_files[1], header=None)
accdf['ACC_RMS'] = accdf.apply(rms_detrend, axis=1)  # row-wise apply
accdf = accdf.drop([1, 2, 3], axis=1)
accdf.columns = ['Days', 'ACC_RMS']  # assign the new names as a list
However, I have 70 such accelerometer files, each with 4000+ rows. Is there a better and quicker (Pythonic) way to do this? The code above handles just one file and is very slow. Thanks.
Use:
accdf['ACC_RMS'] = np.sqrt(accdf.pop(1)**2 + accdf.pop(2)**2 + accdf.pop(3)**2)
print (accdf)
0 ACC_RMS
0 1.963 72.034714
1 1.963 71.909666
2 1.963 68.709534
3 1.963 65.375837
4 1.963 63.150614
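For reference, `DataFrame.pop` returns a column and removes it from the frame in place, so the one-liner above both computes the new column and drops the three axis columns. A small sketch using the first rows from the question (the name `sample` is only for illustration):

```python
import numpy as np
import pandas as pd

# First three rows copied from the question; `sample` is an illustrative name.
sample = pd.DataFrame(
    [[1.963, -12.0, -71.0, -2.0],
     [1.963, -11.0, -71.0, -3.0],
     [1.963, -14.0, -67.0, -6.0]]
)

# pop() hands back each axis column and deletes it from `sample`.
sample['ACC_RMS'] = np.sqrt(sample.pop(1)**2 + sample.pop(2)**2 + sample.pop(3)**2)
sample.columns = ['Days', 'ACC_RMS']

remaining_columns = list(sample.columns)
```

Because the axis columns are gone afterwards, running the same line a second time on the same frame raises a KeyError.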
NumPy solution for better performance:
#[50000 rows x 4 columns]
accdf = pd.concat([accdf] * 10000, ignore_index=True)
In [27]: %timeit (accdf.iloc[:,1:]**2).sum(1).pow(1/2)
1.97 ms ± 89.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [28]: %timeit np.sqrt(np.sum(accdf.to_numpy()[:,1:]**2, axis=1))
202 µs ± 1.25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Unfortunately, my pop-based solution raises an error when timed (presumably because pop removes the columns on the first run), but I guess it is slower than the NumPy-only solution.
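Since the question mentions about 70 files, the NumPy expression could be wrapped in a small helper and applied to each file. This is only a sketch: the function name `acc_magnitude` is made up here, and it assumes every file is a headerless CSV with the day count in the first column followed by the three axes, as in the question.

```python
import io

import numpy as np
import pandas as pd

def acc_magnitude(source):
    """Read one headerless accelerometer CSV and return Days plus ACC_RMS."""
    raw = pd.read_csv(source, header=None)
    values = raw.to_numpy()
    return pd.DataFrame({
        'Days': values[:, 0],
        # Same expression as in the timing above, limited to the three axis columns.
        'ACC_RMS': np.sqrt(np.sum(values[:, 1:4]**2, axis=1)),
    })

# In-memory stand-in for one file so the sketch runs without real data.
demo_csv = io.StringIO("1.963,-12.0,-71.0,-2.0\n1.963,-11.0,-71.0,-3.0\n")
demo = acc_magnitude(demo_csv)

# With the real files from the question, something like:
# results = [acc_magnitude(path) for path in ACC_files]
```

Returning a new frame instead of popping columns keeps the helper safe to call repeatedly on the same input.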
A method from pandas
(df.iloc[:,1:]**2).sum(1).pow(1/2)
Out[26]:
0 72.034714
1 71.909666
2 68.709534
3 65.375837
4 63.150614
dtype: float64