Statistics and Data Analysis-S2

Contacts: H. Costantini or C. Schimd

 

Programme:

1. Basics of probability
Axiomatic, frequentist, and Bayesian probability. Discrete and continuous random variables; cumulative distribution function (cdf) and probability density function (pdf); conditional and marginal probability, Bayes' theorem. Common distributions: uniform, Gaussian, Student's t, chi-square, binomial, Poisson. Central limit theorem, inequalities. Random vectors, covariance matrix, multivariate pdf. Moments, characteristic function. Transformation of random variables.

2. Basics of statistics
Descriptive statistics; kernel density estimation (KDE), maximum entropy method (MEM). Parameter estimation: maximum likelihood and minimum chi-square methods; linear regression. Error propagation, confidence intervals, correlations. Hypothesis testing: likelihood ratio, odds ratio, statistical tests (chi-square, Student's t, Fisher-Snedecor). Fisher information matrix. Jackknife, bootstrap.

3. Stochastic processes and optimisation
Time series, Markov processes, spatial processes and random fields in 2, 3, and N dimensions. Power spectral density, correlation functions, Wiener-Khintchine theorem. Stochastic differential equations, autoregressive (AR) processes, Yule-Walker equations, Kalman filter, Feynman-Kac formula. Deterministic optimisation: steepest descent, conjugate gradient, Levenberg-Marquardt. Stochastic optimisation: Markov chain Monte Carlo (MCMC), Hamiltonian Monte Carlo, simulated annealing, parallel tempering, importance sampling.

4. Spectral and multivariate analysis: sampling, deconvolution, filtering, classification
Orthogonal polynomials (shapelets, Zernike), Fourier transforms (DFT, FFT, WFT). Sampling: Nyquist-Shannon theorem, periodogram. Integral transforms, Fredholm and Volterra integral equations, Fredholm alternative, resolvent and Nyström methods, Richardson-Lucy deconvolution. Inverse problems: optimal (Wiener) filtering, regularisation methods (zero-order, Tikhonov/ridge regression). Karhunen-Loève transform, principal component analysis (PCA), linear discriminant analysis (LDA).

 

Exercise sessions:

  • Poisson statistics: counting experiments with background.
  • Histograms, kernel density estimator (KDE), maximum entropy method (MEM).
  • Fitting a polynomial, fitting a straight line with errors in both variables. Goodness-of-fit.
  • Errors by jackknife, bootstrap, Monte Carlo synthesis.
  • Scattering: numerical solution of a Volterra equation.
  • Tikhonov regularisation and differentiation of noisy data.
  • Kalman filtering: tracking particles in a detector or satellites in the sky.
  • Image restoration by Wiener filtering.
  • PCA: classification of galaxy spectra, image compression.
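
As an illustration of the bootstrap exercise listed above, the error on a sample mean might be estimated as in the following minimal sketch. It is not taken from the course material: the function name `bootstrap_stderr`, the number of resamples, the fixed seed, and the sample values are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_stderr(data, n_resamples=2000, seed=0):
    """Estimate the standard error of the sample mean by bootstrap resampling.

    Hypothetical helper for illustration; not part of the course material.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    means = []
    for _ in range(n_resamples):
        # Draw n points with replacement from the observed sample.
        resample = rng.choices(data, k=n)
        means.append(statistics.fmean(resample))
    # The spread of the resampled means approximates the standard error.
    return statistics.stdev(means)


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    sample = [4.9, 5.1, 5.0, 5.3, 4.7, 5.2, 4.8, 5.0, 5.4, 4.6]
    analytic = statistics.stdev(sample) / len(sample) ** 0.5
    print(f"bootstrap: {bootstrap_stderr(sample):.3f}, analytic: {analytic:.3f}")
```

For the mean, the bootstrap estimate should land close to the analytic value s/sqrt(n); the method becomes useful for estimators (median, fitted parameters) where no such closed form is available.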

 

References:
- Cowan, Statistical Data Analysis (Oxford Science Publications)
- Starck & Murtagh, Handbook of Astronomical Data Analysis (Springer)
- Van Kampen, Stochastic Processes in Physics and Chemistry (Springer)
- Press, Teukolsky, Vetterling & Flannery, Numerical Recipes in Fortran/C/C++ (Cambridge University Press)