I have a time-series A
holding several values. I need to obtain a series B
that is defined algebraically as follows:
B[t] = a * A[t] + b * B[t-1]
where we can assume B[0] = 0
, and a
and b
are real numbers.
Is there any way to do this type of recursive computation in Pandas? Or do I have no choice but to loop in Python as suggested in this answer?
As an example of input:
> A = pd.Series(np.random.randn(10,))
0 -0.310354
1 -0.739515
2 -0.065390
3 0.214966
4 -0.605490
5 1.293448
6 -3.068725
7 -0.208818
8 0.930881
9 1.669210
A recursive definition of a function defines values of the function for some inputs in terms of the values of the same function for other (usually smaller) inputs. For example, the factorial function n! is defined by the rules. 0! = 1.
Using “contains” to Find a Substring in a Pandas DataFrame The contains method returns boolean values for the Series with True for if the original Series value contains the substring and False if not. A basic application of contains should look like Series. str. contains("substring") .
savetxt is still the fastest option, but only by a narrow margin: when benchmarked with pandas 1.3. 0 and numpy 1.20. 3, aa. to_csv() took 2.64 s while savetxt 2.53 s.
As I noted in a comment, you can use scipy.signal.lfilter
. In this case (assuming A
is a one-dimensional numpy array), all you need is:
B = lfilter([a], [1.0, -b], A)
Here's a complete script:
import numpy as np
from scipy.signal import lfilter
np.random.seed(123)
A = np.random.randn(10)
a = 2.0
b = 3.0
# Compute the recursion using lfilter.
# [a] and [1, -b] are the coefficients of the numerator and
# denominator, resp., of the filter's transfer function.
B = lfilter([a], [1, -b], A)
print B
# Compare to a simple loop.
B2 = np.empty(len(A))
for k in range(0, len(B2)):
if k == 0:
B2[k] = a*A[k]
else:
B2[k] = a*A[k] + b*B2[k-1]
print B2
print "max difference:", np.max(np.abs(B2 - B))
The output of the script is:
[ -2.17126121e+00 -4.51909273e+00 -1.29913212e+01 -4.19865530e+01
-1.27116859e+02 -3.78047705e+02 -1.13899647e+03 -3.41784725e+03
-1.02510099e+04 -3.07547631e+04]
[ -2.17126121e+00 -4.51909273e+00 -1.29913212e+01 -4.19865530e+01
-1.27116859e+02 -3.78047705e+02 -1.13899647e+03 -3.41784725e+03
-1.02510099e+04 -3.07547631e+04]
max difference: 0.0
Another example, in IPython, using a pandas DataFrame instead of a numpy array:
If you have
In [12]: df = pd.DataFrame([1, 7, 9, 5], columns=['A'])
In [13]: df
Out[13]:
A
0 1
1 7
2 9
3 5
and you want to create a new column, B
, such that B[k] = A[k] + 2*B[k-1]
(with B[k] == 0
for k < 0), you can write
In [14]: df['B'] = lfilter([1], [1, -2], df['A'].astype(float))
In [15]: df
Out[15]:
A B
0 1 1
1 7 9
2 9 27
3 5 59
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With