
Getting posterior distribution of difference between two variables using PYMC3

Tags: python, pymc3

Assume we are looking at daily prices of two stocks, A and B. The prior is simple: the prices are normally distributed, with mu_A and mu_B each uniformly distributed on [10, 100] and sigma_A and sigma_B each uniformly distributed on [1, 10]. (I know these assumptions are naive or even wrong; they are only meant to make the question clearer.)

Now assume I have observed these two stocks for a month and collected the price data. I can get the posterior distributions of A and B separately, but I don't know how to get the posterior distribution of the difference between the two stocks.

import pymc3 as pm

prices_A = [25, 20, 26, 23, 30, 25]
prices_B = [45, 49, 52, 58, 45, 48]
basic_model = pm.Model()
with basic_model:
    mu_A = pm.Uniform('mu_A', lower=10, upper=100)
    sigma_A = pm.Uniform('sigma_A', lower=1, upper=10)
    mu_B = pm.Uniform('mu_B', lower=10, upper=100)
    sigma_B = pm.Uniform('sigma_B', lower=1, upper=10)
    A = pm.Normal('Y_1', mu=mu_A, sd=sigma_A, observed=prices_A)
    B = pm.Normal('Y_2', mu=mu_B, sd=sigma_B, observed=prices_B)
    dif = pm.Deterministic('dif', A-B)
map_estimate = pm.find_MAP(model=basic_model)
map_estimate

However, the resulting estimate does not give me a distribution of dif. Am I confusing the concept of posterior distribution?

Cong Ba asked Nov 28 '25 01:11



1 Answer

Subtract one variable from the other. You can do it after sampling, for example:

C = trace['A'] - trace['B']

or you can do it as part of your model using a deterministic variable:

C = pm.Deterministic('C', A - B)
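
Either way, the subtraction happens draw by draw: sample i of one variable is paired with sample i of the other, and the result is itself a set of posterior samples of the difference. A minimal sketch of that pairing, with plain Python lists standing in for the trace arrays (the numbers are made up for illustration and are not output from any model):

```python
# Hypothetical stand-ins for trace['A'] and trace['B']. With a real PyMC3
# trace these would be NumPy arrays, and `trace['A'] - trace['B']` would
# perform the same element-wise subtraction in one step.
draws_A = [24.1, 25.3, 23.8, 26.0]
draws_B = [49.0, 50.2, 48.7, 51.1]

# Pair draw i of A with draw i of B; each result is one draw of the difference.
draws_C = [a - b for a, b in zip(draws_A, draws_B)]

print(draws_C)
```

The point of keeping the pairing is that summaries of the difference (spread, intervals, tail probabilities) must be computed from draws_C itself, not from summaries of the two inputs.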

Update:

Now that you have posted your model, I suggest the following:

import pymc3 as pm

prices_A = [25, 20, 26, 23, 30, 25]
prices_B = [45, 49, 52, 58, 45, 48]

basic_model = pm.Model()
with basic_model:
    # Priors on the mean and standard deviation of each stock's price
    mu_A = pm.Uniform('mu_A', lower=10, upper=100)
    sigma_A = pm.Uniform('sigma_A', lower=1, upper=10)
    mu_B = pm.Uniform('mu_B', lower=10, upper=100)
    sigma_B = pm.Uniform('sigma_B', lower=1, upper=10)
    # Likelihoods of the observed prices
    A = pm.Normal('Y_1', mu=mu_A, sd=sigma_A, observed=prices_A)
    B = pm.Normal('Y_2', mu=mu_B, sd=sigma_B, observed=prices_B)
    # Difference of the means, recorded in the trace as 'dif'
    dif = pm.Deterministic('dif', mu_A - mu_B)
    # Draw posterior samples instead of computing a point estimate
    trace = pm.sample()

pm.summary(trace)

Basically, what I am suggesting is that you do not use find_MAP(), which returns only a point estimate. Instead, sample from the posterior and then compute what you want from those samples (stored in trace). For example, summary gives you the mean, standard deviation, and other quantities computed from the posterior samples.
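
To make the sample-then-compute idea concrete without PyMC3, here is a rough sketch using only the standard library. It uses a hand-written random-walk Metropolis sampler, which is not what pm.sample() runs (PyMC3 defaults to NUTS for continuous variables). The function names, step size, burn-in length, and seed are illustrative choices, not part of the model above. Because the two stocks share no parameters in this model, their posteriors can be sampled separately and the mu draws paired afterwards.

```python
import math
import random
import statistics

prices_A = [25, 20, 26, 23, 30, 25]
prices_B = [45, 49, 52, 58, 45, 48]


def log_posterior(mu, sigma, data):
    """Unnormalised log posterior: Normal likelihood with Uniform priors."""
    # Uniform(10, 100) on mu and Uniform(1, 10) on sigma: zero density outside.
    if not (10 <= mu <= 100 and 1 <= sigma <= 10):
        return -math.inf
    return sum(-math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2 for x in data)


def metropolis_mu(data, n_draws, rng, step=1.0, burn_in=2000):
    """Random-walk Metropolis over (mu, sigma); returns the kept mu draws."""
    mu, sigma = statistics.mean(data), 5.0  # assumed starting point
    current = log_posterior(mu, sigma, data)
    kept = []
    for i in range(n_draws + burn_in):
        mu_new = mu + rng.gauss(0, step)
        sigma_new = sigma + rng.gauss(0, step)
        proposed = log_posterior(mu_new, sigma_new, data)
        # Accept with probability min(1, exp(proposed - current)).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            mu, sigma, current = mu_new, sigma_new, proposed
        if i >= burn_in:
            kept.append(mu)
    return kept


rng = random.Random(0)
mu_A_draws = metropolis_mu(prices_A, 5000, rng)
mu_B_draws = metropolis_mu(prices_B, 5000, rng)

# Posterior draws of the difference of the means, one per pair of draws.
dif_draws = [a - b for a, b in zip(mu_A_draws, mu_B_draws)]

# Summaries computed directly from the draws.
dif_sorted = sorted(dif_draws)
lo = dif_sorted[int(0.025 * len(dif_sorted))]
hi = dif_sorted[int(0.975 * len(dif_sorted))]
print("mean:", statistics.mean(dif_draws))
print("sd:", statistics.stdev(dif_draws))
print("central 95% interval:", (lo, hi))
print("P(mu_A < mu_B):", sum(d < 0 for d in dif_draws) / len(dif_draws))
```

Once the draws exist, any other quantity of interest (another quantile, the probability that the gap exceeds some threshold) is one more line over dif_draws.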

You may also want to use sample_ppc (renamed sample_posterior_predictive in later PyMC3 releases) to get "posterior predictive samples":

ppc = pm.sample_ppc(trace, 1000, basic_model)
dif_ppc = ppc['Y_1'] - ppc['Y_2']

dif_ppc represents the price differences you would expect to observe between your stocks, including the uncertainty in their means and standard deviations.
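
Conceptually, a posterior predictive difference is built by taking each posterior draw of the parameters, simulating a new price for each stock from its Normal likelihood, and subtracting. A standard-library sketch of that step; the (mu, sigma) pairs are made-up stand-ins for real posterior draws, and the helper name is hypothetical:

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical stand-ins for posterior draws of (mu, sigma) for each stock.
# With a real trace they would come from trace['mu_A'], trace['sigma_A'], etc.
posterior_A = [(24.5, 3.6), (25.4, 4.1), (23.9, 3.2), (26.1, 5.0)]
posterior_B = [(49.8, 5.5), (48.6, 6.3), (50.9, 4.8), (47.7, 7.1)]


def predictive_draw(mu, sigma):
    """Simulate one new price from Normal(mu, sigma)."""
    return rng.gauss(mu, sigma)


# For each posterior draw, simulate one new price per stock and subtract.
dif_pred = [
    predictive_draw(mu_a, s_a) - predictive_draw(mu_b, s_b)
    for (mu_a, s_a), (mu_b, s_b) in zip(posterior_A, posterior_B)
]
print(dif_pred)
```

Each simulated difference carries the day-to-day noise (sigma) on top of the uncertainty in the means, which is why predictive differences are more spread out than differences of the means alone.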

As a side note, you may want to replace your Uniform distributions with other distributions, such as Normal for the means and HalfNormal for the sigmas.

aloctavodia answered Nov 29 '25 14:11
