I'm starting to learn Stan.
Could anyone explain when and how to use syntax such as
target +=
instead of just:
y ~ normal(mu, sigma)
For example, the Stan manual contains the following model block.
model {
  real ps[K]; // temp for log component densities
  sigma ~ cauchy(0, 2.5);
  mu ~ normal(0, 10);
  for (n in 1:N) {
    for (k in 1:K) {
      ps[k] = log(theta[k])
              + normal_lpdf(y[n] | mu[k], sigma[k]);
    }
    target += log_sum_exp(ps);
  }
}
I think the target line increments the target value, which I believe is the logarithm of the posterior density.
But the posterior density of which parameter?
When is it initialized and updated?
After Stan finishes (and converges), how do I access its value, and how do I use it?
Other examples:
data {
  int<lower=0> J; // number of schools
  real y[J]; // estimated treatment effects
  real<lower=0> sigma[J]; // s.e. of effect estimates
}
parameters {
  real mu;
  real<lower=0> tau;
  vector[J] eta;
}
transformed parameters {
  vector[J] theta;
  theta = mu + tau * eta;
}
model {
  target += normal_lpdf(eta | 0, 1);
  target += normal_lpdf(y | theta, sigma);
}
The example above uses target twice instead of just once.
Another example:
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma_sq;
  vector<lower=-0.5, upper=0.5>[N] y_err;
}
transformed parameters {
  real<lower=0> sigma;
  vector[N] z;
  sigma = sqrt(sigma_sq);
  z = y + y_err;
}
model {
  target += -2 * log(sigma);
  z ~ normal(mu, sigma);
}
This last example even mixes both methods.
To make it even more confusing, I've read that
y ~ normal(0,1);
has the same effect as
increment_log_prob(normal_log(y,0,1));
Could anyone explain why, please?
Could anyone provide a simple example written in two different ways, with "target +=" and in the regular simpler "y ~" way, please?
Regards
The syntax
target += u;
adds u to the target log density.
The target density is the density from which the sampler draws. It must equal the joint density of all the parameters given the data, up to a constant (which is usually achieved via Bayes's rule by coding the joint density of the parameters and the modeled data, up to a constant). You can access it as lp__ in the posterior draws, but be careful: it also includes the Jacobian terms arising from the constraints and omits the constants dropped by sampling statements, so you do not want to use it for model comparison.
From a sampling perspective, writing
target += normal_lpdf(y | mu, sigma);
has the same effect as
y ~ normal(mu, sigma);
The _lpdf signals it's the log probability density function for the normal, which is implicit in the sampling notation. The sampling notation is just shorthand for the target += syntax, and in addition, drops constant terms in the log density.
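To make the dropped constant concrete, here is the normal log density written out. This is the standard textbook formula rather than something quoted from the manual:

```latex
\log \mathrm{Normal}(y \mid \mu, \sigma)
  = -\tfrac{1}{2}\log(2\pi) \;-\; \log \sigma \;-\; \frac{(y - \mu)^2}{2\sigma^2}
```

The first term does not depend on mu or sigma, so y ~ normal(mu, sigma) is free to leave it out, while target += normal_lpdf(y | mu, sigma) keeps it. The sampler explores the same posterior either way, because shifting the target by a constant does not change where it is high or low.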
Both forms are explained in the statements section of the language reference (the second part of the manual) and used in multiple examples throughout the programmer's guide (the first part of the manual).
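Since the question asks for one simple model written both ways, here is a minimal sketch. The model itself (a normal likelihood with unknown mean and scale) and the priors are my own illustrative choices, borrowed from the mixture example in the question; they are not taken from the manual. First with sampling statements:

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // Priors (illustrative assumptions, not a recommendation).
  mu ~ normal(0, 10);
  sigma ~ cauchy(0, 2.5);
  // Likelihood, vectorized over all N observations.
  y ~ normal(mu, sigma);
}
```

And the same model with explicit increments of the target:

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // Each line adds one log density term to the target.
  target += normal_lpdf(mu | 0, 10);
  target += cauchy_lpdf(sigma | 0, 2.5);
  // normal_lpdf on a vector returns the sum over all elements.
  target += normal_lpdf(y | mu, sigma);
}
```

Both programs should give the same posterior draws for mu and sigma up to Monte Carlo error. The reported lp__ will differ by a constant, because the second version keeps the normalizing constants that the first one drops.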
I am just starting to learn Stan and Bayesian statistics, and I mainly rely on John Kruschke's book "Doing Bayesian Data Analysis". In section 14.3.3, he explains:
Thus, the essence of computation in Stan is dealing with the logarithm of the posterior probability density and its gradient; there is no direct random sampling of parameters from distributions.
As a result (still rephrasing Kruschke), a
model [...] like
y ∼ normal(mu,sigma)
[actually] means to multiply the current posterior probability by the density of the normal distribution at the datum value y.
By the rules of logarithms, this multiplication is equivalent to adding the log probability density of the datum y to the current log probability, since log(a*b) = log(a) + log(b).
I concede that I don't grasp the full implications of that, but I think it points in the right direction as to what, mathematically speaking, target += does.
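Written out for a model with a prior on mu and sigma and N independent normal observations (the indices n and N are my own notation, not Kruschke's), the step from products to sums looks like this:

```latex
p(\mu, \sigma \mid y) \;\propto\; p(\mu, \sigma) \prod_{n=1}^{N} \mathrm{Normal}(y_n \mid \mu, \sigma)

\log p(\mu, \sigma \mid y) \;=\; \log p(\mu, \sigma)
  \;+\; \sum_{n=1}^{N} \log \mathrm{Normal}(y_n \mid \mu, \sigma) \;+\; \text{const}
```

Each term on the right-hand side of the second line corresponds to one increment of the target, whether it is written with ~ or with target +=.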