I have a dataframe which is of length 177 and I want to calculate and plot the partial auto-correlation function (PACF).
I have the data imported etc and I do:
from statsmodels.tsa.stattools import pacf
ys = pacf(data[key][array].diff(1).dropna(), alpha=0.05, nlags=176, method="ywunbiased")
xs = range(lags+1)
plt.figure()
plt.scatter(xs,ys[0])
plt.grid()
plt.vlines(xs, 0, ys[0])
plt.plot(ys[1])
The method used results in numbers greater than 1 for very long lags (90ish) which is incorrect and I get a RuntimeWarning: invalid value encountered in sqrtreturn rho, np.sqrt(sigmasq) but since I can't see their source code I don't know what this means.
To be honest, when I search for PACF, all the examples only carry out PACF up to 40 lags or 60 or so and they never have any significant PACF after lag=2 and so I couldn't compare to other examples either.
But when I use:
method="ols"
# or
method="ywmle"
the numbers are corrected. So it must be the algo they use to solve it.
I tried importing inspect and getsource method but its useless it just shows that it uses another package and I can't find that.
If you also know where the problem arises from, I would really appreciate the help.
For your reference, the values for data[key][array] are:
[1131.130005, 1144.939941, 1126.209961, 1107.300049, 1120.680054, 1140.839966, 1101.719971, 1104.23999, 1114.579956, 1130.199951, 1173.819946, 1211.920044, 1181.27002, 1203.599976, 1180.589966, 1156.849976, 1191.5, 1191.329956, 1234.180054, 1220.329956, 1228.810059, 1207.01001, 1249.47998, 1248.290039, 1280.079956, 1280.660034, 1294.869995, 1310.609985, 1270.089966, 1270.199951, 1276.660034, 1303.819946, 1335.849976, 1377.939941, 1400.630005, 1418.300049, 1438.23999, 1406.819946, 1420.859985, 1482.369995, 1530.619995, 1503.349976, 1455.27002, 1473.98999, 1526.75, 1549.380005, 1481.140015, 1468.359985, 1378.550049, 1330.630005, 1322.699951, 1385.589966, 1400.380005, 1280.0, 1267.380005, 1282.829956, 1166.359985, 968.75, 896.23999, 903.25, 825.880005, 735.090027, 797.869995, 872.8099980000001, 919.1400150000001, 919.320007, 987.4799800000001, 1020.6199949999999, 1057.079956, 1036.189941, 1095.630005, 1115.099976, 1073.869995, 1104.48999, 1169.430054, 1186.689941, 1089.410034, 1030.709961, 1101.599976, 1049.329956, 1141.199951, 1183.26001, 1180.550049, 1257.640015, 1286.119995, 1327.219971, 1325.829956, 1363.609985, 1345.199951, 1320.640015, 1292.280029, 1218.890015, 1131.420044, 1253.300049, 1246.959961, 1257.599976, 1312.410034, 1365.680054, 1408.469971, 1397.910034, 1310.329956, 1362.160034, 1379.319946, 1406.579956, 1440.670044, 1412.160034, 1416.180054, 1426.189941, 1498.109985, 1514.680054, 1569.189941, 1597.569946, 1630.73999, 1606.280029, 1685.72998, 1632.969971, 1681.550049, 1756.540039, 1805.810059, 1848.359985, 1782.589966, 1859.449951, 1872.339966, 1883.949951, 1923.569946, 1960.22998, 1930.6700440000002, 2003.369995, 1972.290039, 2018.050049, 2067.560059, 2058.899902, 1994.9899899999998, 2104.5, 2067.889893, 2085.51001, 2107.389893, 2063.110107, 2103.840088, 1972.180054, 1920.030029, 2079.360107, 2080.409912, 2043.939941, 1940.2399899999998, 1932.22998, 2059.73999, 2065.300049, 2096.949951, 2098.860107, 2173.600098, 2170.949951, 2168.27002, 2126.149902, 2198.810059, 2238.830078, 2278.8701170000004, 2363.639893, 2362.719971, 2384.199951, 2411.800049, 2423.409912, 2470.300049, 2471.649902, 2519.360107, 2575.26001, 2584.840088, 2673.610107, 2823.810059, 2713.830078, 2640.8701170000004, 2648.050049, 2705.27002, 2718.3701170000004, 2816.290039, 2901.52002, 2913.97998]
Your time series is pretty clearly not stationary, so that Yule-Walker assumptions are violated.
More generally, PACF is usually appropriate with stationary time series. You might difference your data first, before considering the partial autocorrelations.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With