- Lecturer: Luca Clissa
- Credits: 6
- SSD: FIS/07
- Language: English
- Modules: Luca Clissa (Module 1), Claudia Sala (Module 2)
- Teaching Mode: Traditional lectures (Module 1), Traditional lectures (Module 2)
- Campus: Bologna
- Course: Second cycle degree programme (LM) in Physics (cod. 9245). Also valid for Second cycle degree programme (LM) in Advanced Methods in Particle Physics (cod. 5810)
- from Sep 16, 2024 to Nov 15, 2024
- from Nov 11, 2024 to Dec 20, 2024
Learning outcomes
At the end of the course, the student will be acquainted with the main statistical concepts used in Physics. After a review of the fundamentals of probability theory, parametric inferential statistics will be introduced, from point estimates and confidence intervals to hypothesis testing and goodness-of-fit. Each topic will be addressed in both the Bayesian and frequentist approaches. Dedicated practical sessions will allow the student to become familiar with these conceptual tools by studying applications in Applied Physics.
Course contents
The course is structured as follows.
For all students:
- Module 1, theory (lecturer L. Clissa)
Only for Applied Physics Students:
- Module 2a, exercises and complements (lecturer C. Sala)
Only for Nuclear and Subnuclear Physics Students:
- Module 2b, exercises and complements (lecturer M. Negrini)
- Module 3b, laboratory (lecturer G. Sirri)
Module 1 Program
- Probability Concepts
  - Definitions: axiomatic, combinatorial, frequentist, and subjectivist
  - Conditional probability
  - Statistical independence
  - Bayes' theorem
- Random Variables and Distributions
  - Probability density/mass function, cumulative distribution function
  - Multivariate distributions
  - Examples of distributions: binomial, multinomial, Poisson, exponential, normal, multivariate normal, chi-squared, Breit-Wigner, Landau
  - Marginal and conditional densities
  - Functions of random variables
  - Characteristic function and distribution moments: expected value, variance, and covariance
  - Central Limit Theorem
  - Propagation of errors with correlated variables
- Statistical Inference
  - Fisher information
  - Sample statistics, test statistics, and sufficient statistics
  - Estimators for mean and variance
  - Maximum likelihood method
  - Multi-parameter estimation with uncertainty and correlations
  - Bayesian estimators, Jeffreys priors
  - Least squares method
- Monte Carlo Method
  - Convergence criteria
  - Law of large numbers
- Hypothesis Testing
  - Simple hypotheses
  - Test efficiency and power
  - Neyman-Pearson lemma
  - Linear test, Fisher discriminant
  - Statistical significance, p-values, Look-Elsewhere Effect
  - Chi-squared method for hypothesis testing
- Confidence Intervals
  - Exact methods: Gaussian and Poisson cases
  - Unified approach
  - Bayesian method
  - CL method
  - Systematic errors and nuisance parameters
  - Asymptotic properties
- Multivariate Methods
  - Neural Networks, Boosted Decision Trees
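To make the maximum likelihood method and the Fisher information listed under Statistical Inference more concrete, the following is a minimal illustrative sketch in Python; it is not part of the official course material. It assumes an exponential model f(t; tau) = exp(-t/tau)/tau, for which the maximum likelihood estimate of tau is the sample mean and the Fisher information gives the uncertainty tau/sqrt(n). The function names, the seed, and the true value of tau are hypothetical choices made for this example.

```python
import math
import random


def exponential_mle(sample):
    """Return (tau_hat, sigma_tau) for the model f(t; tau) = exp(-t/tau) / tau.

    The log-likelihood is maximised by the sample mean; the uncertainty
    comes from the Fisher information I(tau) = n / tau^2, evaluated at tau_hat.
    """
    n = len(sample)
    tau_hat = sum(sample) / n
    sigma_tau = tau_hat / math.sqrt(n)
    return tau_hat, sigma_tau


def neg_log_likelihood(tau, sample):
    """Negative log-likelihood of the exponential model for a given tau."""
    return len(sample) * math.log(tau) + sum(sample) / tau


# Simulated data: the true value and the seed are assumptions for this sketch.
rng = random.Random(12345)
true_tau = 2.0
sample = [rng.expovariate(1.0 / true_tau) for _ in range(1000)]

tau_hat, sigma_tau = exponential_mle(sample)
print(f"tau_hat = {tau_hat:.3f} +/- {sigma_tau:.3f}")
```

Evaluating `neg_log_likelihood` slightly above and below `tau_hat` gives a quick check that the estimate sits at the minimum of the negative log-likelihood.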
Module 2a Program
Introduction to R and RStudio. Generation of random variables and probability distributions. Law of large numbers. Central Limit Theorem. Hypothesis testing. Student's t-test. Fisher's F-test. p-value: statistical significance and power. Maximum likelihood estimation. Linear regression. Correlation. Analysis of variance. Generalized linear models. Multivariate linear regression. Multicollinearity. Lasso and Ridge penalties.
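As an illustration of the Central Limit Theorem topic above, the following is a minimal sketch of a simulation. Module 2a itself uses R; Python is used here only for illustration, and the function name, seed, and sample sizes are assumptions made for this example. Means of uniform random numbers are standardized with the known mean 1/2 and variance 1/12 of the uniform distribution, and should then be approximately standard normal.

```python
import random
import statistics


def standardized_means(n_per_mean, n_means, rng):
    """Return n_means standardized averages of n_per_mean uniform(0, 1) draws."""
    mu = 0.5                   # mean of the uniform(0, 1) distribution
    sigma = (1.0 / 12) ** 0.5  # its standard deviation
    scale = sigma / n_per_mean ** 0.5
    values = []
    for _ in range(n_means):
        xbar = sum(rng.random() for _ in range(n_per_mean)) / n_per_mean
        values.append((xbar - mu) / scale)
    return values


rng = random.Random(2024)  # fixed seed so the sketch is reproducible
z = standardized_means(n_per_mean=30, n_means=5000, rng=rng)

# For a standard normal variable, about 68% of values lie within one sigma.
frac_within_one_sigma = sum(abs(v) < 1 for v in z) / len(z)
print(f"mean = {statistics.mean(z):.3f}, stdev = {statistics.stdev(z):.3f}")
print(f"fraction with |z| < 1: {frac_within_one_sigma:.3f}")
```

Increasing `n_per_mean` should bring the distribution of the standardized means closer to the normal shape, which is the content of the theorem.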
Module 2b Program
Exercises and supplements on Monte Carlo methods and unfolding.
Module 3b Program
Elements of C++ and ROOT. RooFit workspace, Factory, composite models, multidimensional models. Use of RooStats to calculate confidence intervals, Profile Likelihood, Feldman-Cousins, Bayesian intervals, with and without nuisance parameters. Use of TMVA as a classifier, description of TMVAGui.
Readings/Bibliography
Module 1:
- Glen Cowan, Statistical Data Analysis, Oxford Univ. Press, 1998
- (optional, more statistical perspective) T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed., Springer, 2009
Module 2a:
- J. Maindonald, W. J. Braun, Data Analysis and Graphics Using R: An Example-Based Approach, Cambridge Univ. Press, 2003
- G. James, D. Witten, T. Hastie, R. Tibshirani, An Introduction to Statistical Learning with Applications in R, Springer, 2013
Modules 2b and 3b:
- Glen Cowan, Statistical Data Analysis, Oxford Univ. Press, 1998
- O. Behnke et al., Data Analysis in High Energy Physics: A Practical Guide to Statistical Methods, Wiley, 2013
- A. G. Frodesen, O. Skjeggestad, H. Tofte, Probability and Statistics in Particle Physics, Universitetsforlaget, 1979
- G. D'Agostini, Bayesian reasoning in data analysis - A critical introduction, World Scientific Publishing, 2003
Teaching methods
Lectures and laboratory sessions using applications for solving practical problems.
Given the activities and teaching methods adopted, all students attending Modules 2a and 3b must first complete Modules 1 and 2 of the safety training for study places (delivered in e-learning mode).
Assessment methods
The exam consists of a written test, lasting two hours, structured as follows:
- three theory questions
- one exercise
- one question for the laboratory part, in which you are asked to comment on a block of code
The test may differ depending on the track (Module 2a vs. Modules 2b and 3b).
To obtain honors, students must score 30/30 in the written test and take an additional oral test.
Important: to take the written test, students must first complete the practical laboratory tests and submit them to the lecturer; these tests do not affect the final grade.
Teaching tools
Course slides are available on Virtuale. Please contact the lecturers in case of problems.
Office hours
See the website of Luca Clissa
See the website of Claudia Sala