Bayesian Inference

Bayes' theorem provides the mathematical basis for the solution of the estimation problem. Suppose a discretised system S can be described by a vector ${\bf x}$ containing the unknown parameters for which one would like to obtain estimates; ${\bf x}$ will be referred to as the parameter vector. Let us assume that a number of measurements of the system S are available, represented by a vector ${\bf d}$. Using Bayes' rule, one can write

{\rm p} \left( {\bf x} \vert {\bf d} \right)
= \frac{{\rm p} \left( {\bf d} \vert {\bf x} \right) {\rm p} \left(
{\bf x} \right)}{\displaystyle\int {\rm p} \left( {\bf d} \vert {\bf x}
\right) {\rm p} \left( {\bf x} \right) {\rm d}{\bf x}}

- ${\rm p}({\bf x} \vert {\bf d})$ is the posterior probability density function of ${\bf x}$. It corresponds to the probability density of the parameter of interest, namely ${\bf x}$, given the measurements ${\bf d}$, and therefore reflects all of the information provided by the measurements.
- ${\rm p}({\bf d} \vert {\bf x})$ is the likelihood function. It is derived directly from the measurement equation relating the measurements to the parameter vector and from the noise process that corrupts them. In most scenarios, this function can be evaluated without much conceptual or computational difficulty.
- ${\rm p}({\bf x})$ is the prior probability density function of ${\bf x}$. It represents the knowledge of the parameter vector prior to the assimilation of the measurements.
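Bayes' rule can be illustrated numerically. The sketch below (all numerical values are assumptions chosen for illustration) infers the scalar mean of a Gaussian from noisy measurements, approximating the normalising integral in the denominator on a discretised parameter grid:

```python
# Minimal numerical illustration of Bayes' rule: infer the mean x of a
# Gaussian from noisy data d. The grid approximates the normalising
# integral in the denominator of Bayes' theorem.
import numpy as np

rng = np.random.default_rng(1)
x_true, noise_std = 2.0, 1.0                  # assumed "truth" and noise level
d = x_true + noise_std * rng.normal(size=50)  # measurements

x_grid = np.linspace(-5.0, 5.0, 1001)         # discretised parameter axis
dx = x_grid[1] - x_grid[0]
prior = np.exp(-0.5 * x_grid**2 / 3.0**2)     # broad Gaussian prior p(x)
# Likelihood p(d | x): product of independent Gaussian factors, in log form
log_like = np.array([-0.5 * np.sum((d - x) ** 2) / noise_std**2 for x in x_grid])
unnorm = np.exp(log_like - log_like.max()) * prior
posterior = unnorm / (unnorm.sum() * dx)      # divide by the (approximate) integral

x_map = x_grid[np.argmax(posterior)]          # posterior mode
print("posterior mode:", x_map)
```

With 50 measurements and a broad prior, the posterior concentrates near the sample mean, as expected.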

Application: Signal Identification

In this example, Bayesian inference will be used for identification of a noise-contaminated sinusoidal signal. More specifically, the problem involves identifying the parameters $a_1$, $a_2$ and $a_3$ of

y\left( t\right) = a_1 {\rm cos} \left( a_3 t\right) + a_2 {\rm sin} \left( a_3 t\right)

The noisy measurements $y_k$, $k = 1, \hdots, N$, are given by

y_k = a_1 {\rm cos} \left( a_3 t_k\right) + a_2 {\rm sin} \left( a_3 t_k\right)
+ n_k = \hat{y}_k + n_k

where $n_k$ are independent and identically distributed (IID) Gaussian random variables with zero mean and variance $\sigma^2$, modelling Gaussian white noise. According to Bayes' theorem, the posterior pdf of $(a_1, a_2, a_3)$ is proportional to the product of the likelihood function and the prior pdf:

{\rm p} (a_1, a_2, a_3 \vert y_1, \hdots, y_N) \ \propto \ {\rm p}(y_1, \hdots, y_N \vert a_1, a_2, a_3) {\rm p}(a_1, a_2, a_3)

where the likelihood pdf is given by

{\rm p} (y_1, \hdots, y_N \vert a_1, a_2, a_3) \propto \prod_{i=1}^{N} {\rm p} (y_i \vert a_1, a_2, a_3)
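Denoting the noise variance by $\sigma^2$, each Gaussian factor takes the explicit form

{\rm p} (y_i \vert a_1, a_2, a_3) \propto \exp \left( -\frac{\left( y_i - \hat{y}_i \right)^2}{2 \sigma^2} \right)

so the log-likelihood reduces, up to a constant, to a negative scaled sum of squared residuals between the measurements and the model predictions.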

For this example, as the data is sufficiently dense in time, a flat prior will be used, i.e. ${\rm p}(a_1, a_2, a_3) \propto 1$, and thus the posterior pdf is proportional to the likelihood pdf. In certain cases, the likelihood function needs to be complemented by an informative prior to provide a meaningful estimate of the parameters; this depends on how much information regarding the unknown parameters one can extract from the measurements. Here, it will be assumed that the available observations contain enough information to provide meaningful estimates.

Possible realizations of the true signal are obtained from MCMC simulation of the posterior pdf of $(a_1, a_2, a_3)$. These, along with the marginal posterior distributions of the parameters, are shown below using 100,000 MCMC samples. The black curve represents the true signal, for which only noisy observations are available (shown using red circles). One can observe the unimodality of the marginal pdfs, indicating the well-posedness of the problem.
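As a sketch of how such a posterior could be sampled, the following uses a random-walk Metropolis algorithm (one common MCMC scheme) on synthetic data. The true parameter values, noise level, proposal scales, sample count, and initialisation are all assumptions for illustration, not values from the original example:

```python
# Sketch: random-walk Metropolis sampling of p(a1, a2, a3 | y_1..N)
# for the noisy sinusoid, under a flat prior (posterior ∝ likelihood).
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed true parameters and noise level)
a_true = np.array([1.0, 0.5, 2.0])           # a1, a2, a3
sigma = 0.2                                  # noise std, assumed known
t = np.linspace(0.0, 10.0, 200)

def model(a, t):
    return a[0] * np.cos(a[2] * t) + a[1] * np.sin(a[2] * t)

y = model(a_true, t) + sigma * rng.normal(size=t.size)

def log_post(a):
    # Flat prior: log-posterior equals log-likelihood up to a constant,
    # i.e. a negative scaled sum of squared residuals.
    r = y - model(a, t)
    return -0.5 * np.sum(r**2) / sigma**2

n_samples = 20000
step = np.array([0.02, 0.02, 0.002])         # proposal stds (tuning assumption)
chain = np.empty((n_samples, 3))
a = a_true.copy()                            # initialised at the truth for brevity;
lp = log_post(a)                             # in practice use an overdispersed start
accepted = 0
for i in range(n_samples):
    prop = a + step * rng.normal(size=3)     # symmetric random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:  # Metropolis accept/reject
        a, lp = prop, lp_prop
        accepted += 1
    chain[i] = a

burn = chain[n_samples // 2:]                # discard first half as burn-in
a_hat = burn.mean(axis=0)
print("acceptance rate:", accepted / n_samples)
print("posterior means:", a_hat)
```

Histograms of the columns of `burn` give the marginal posterior pdfs of the three parameters, and evaluating `model` at sampled parameter triples gives posterior realizations of the signal.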