Statistics and Data Analysis
This 3-day residential DISCnet event (DISC6004) covers the theory and techniques of statistics and data analysis.
Aim
To acquire the skills needed for analysis of experimental data and model fitting.
Objectives
At the end of this course, a successful student will be able to:

Fit models to data, with robust estimates of model parameters, incorporating prior information.

Efficiently explore model parameter spaces.

Make informed choices between different possible models.
Part 1: Statistics
The first part of the school will cover the following statistical methods:

Basics of Bayesian statistics, including rules of probability, Bayesian reasoning and priors, moments and cumulants, common 1D distributions.

Multivariate distributions, including multivariate Gaussian, marginalisation, principal components analysis (PCA), changing variables.

Estimator theory, including bias and variance, Fisher matrices, and the Cramér-Rao bound.

Applications of Bayesian methods, including template fitting, Wiener filtering, marginalisation over nuisance parameters.

Model selection, including Bayesian evidence and proxies.

Monte Carlo methods, including Markov chain Monte Carlo (MCMC) techniques, pseudo-random number generators, the theory of finite Markov chains in a nutshell, applications of Monte Carlo (integrals and the Ising model), and a survey of Monte Carlo methods.
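As a flavour of the MCMC material listed above, the following is a minimal sketch of a random-walk Metropolis sampler, not course material. The target (a standard Gaussian), the step size, the burn-in length and the helper names `metropolis` and `log_gaussian` are all assumptions chosen for illustration; only the Python standard library is used.

```python
import math
import random


def metropolis(log_prob, x0, n_steps, step_size, rng):
    """Random-walk Metropolis sampler for a 1D target (illustrative helper)."""
    x = x0
    lp = log_prob(x)
    samples = []
    n_accepted = 0
    for _ in range(n_steps):
        # Propose a symmetric Gaussian step around the current position.
        x_new = x + rng.gauss(0.0, step_size)
        lp_new = log_prob(x_new)
        # Accept with probability min(1, p(x_new) / p(x)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = x_new, lp_new
            n_accepted += 1
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples, n_accepted / n_steps


def log_gaussian(x):
    """Log of an unnormalised standard Gaussian (assumed target)."""
    return -0.5 * x * x


rng = random.Random(42)  # fixed seed so the run is reproducible
chain, acceptance = metropolis(
    log_gaussian, x0=0.0, n_steps=50000, step_size=2.4, rng=rng
)

burn_in = 1000  # discard early samples (arbitrary choice for this sketch)
kept = chain[burn_in:]
mean = sum(kept) / len(kept)
variance = sum((x - mean) ** 2 for x in kept) / len(kept)
print(f"acceptance={acceptance:.2f} mean={mean:.3f} variance={variance:.3f}")
```

The sample mean and variance should land close to 0 and 1 respectively. Because the target enters only through a ratio, its normalisation constant is never needed, which is what makes the method useful for posterior sampling.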
Part 2: Data analysis
The second part of the school covers data analysis methods:

Treatment of errors, including experimental errors, weighted averages, covariance matrix, combining errors, nonlinear functions of several variables, change of variables.

Maximum likelihood methods, including least-squares fitting, linear least squares (uncorrelated measurements), full linear least squares (correlated measurements), and nonlinear fitting.

Chi-squared (χ²) distribution, including χ² testing, the error ellipse, and comparing mean and variance between samples.
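To illustrate two of the topics listed above (weighted averages and linear least squares with uncorrelated measurements), here is a short sketch assuming independent Gaussian errors. The function names `weighted_mean` and `fit_line` and the example data are hypothetical; the formulas are the standard closed-form inverse-variance results.

```python
import math


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)


def fit_line(x, y, sigmas):
    """Weighted least-squares fit of y = intercept + slope * x.

    Assumes uncorrelated Gaussian errors `sigmas` on y and no error on x.
    Returns intercept, slope, their standard errors, and the chi-squared.
    """
    w = [1.0 / s ** 2 for s in sigmas]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = s0 * sxx - sx ** 2
    intercept = (sxx * sy - sx * sxy) / delta
    slope = (s0 * sxy - sx * sy) / delta
    chi2 = sum(
        wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y)
    )
    return intercept, slope, math.sqrt(sxx / delta), math.sqrt(s0 / delta), chi2


# Hypothetical measurements, for illustration only.
x_data = [0.0, 1.0, 2.0, 3.0]
y_data = [1.1, 2.9, 5.2, 6.8]
y_err = [0.2, 0.2, 0.2, 0.2]

a, b, a_err, b_err, chi2 = fit_line(x_data, y_data, y_err)
print(f"intercept={a:.2f}+/-{a_err:.2f} slope={b:.2f}+/-{b_err:.2f} chi2={chi2:.2f}")
```

With four points and two fitted parameters there are two degrees of freedom, so the returned chi-squared can be compared against a χ² distribution with two degrees of freedom to judge the quality of the fit.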
Learning & Teaching Resources
Monte Carlo Simulations:

Metropolis, N.; Rosenbluth, A.W.; Rosenbluth, M.N.; Teller, A.H.; Teller, E. "Equation of State Calculations by Fast Computing Machines". Journal of Chemical Physics 21 (6): 1087 (1953).

Metropolis, N.; Ulam, S. "The Monte Carlo Method". Journal of the American Statistical Association 44 (247): 335–341 (1949).

Madras, N. "Lectures on Monte Carlo Methods". American Mathematical Society (2002).
Data Analysis:

Barlow, R.J.: Statistics (Wiley)

Robinson, E.L.: Data Analysis for Scientists and Engineers (Princeton)

Hogg, Bovy & Lang: Data analysis recipes: Fitting a model to data (http://arxiv.org/abs/1008.4686)
Examples
Examples will be given during the course.
Prerequisites / Linked Modules
It is recommended that students have the following software installed on their laptops:

Anaconda Python distribution (https://www.anaconda.com/download/)

emcee, an affine-invariant MCMC code (http://dfm.io/emcee/current/)
Approximate hours: taught material + exercises + self-study
Each morning and afternoon session will start with a 1-hour lecture, followed by 2 hours of hands-on exercises.