- Instructor: Julien Noce Donini
- Credits: 6
- SSD: FIS/01
- Teaching language: English
- Teaching mode: Conventional - In-person lectures
- Campus: Bologna
- Degree programme: Laurea Magistrale (second-cycle degree) in Advanced Methods in Particle Physics (code 5810)
Learning outcomes
This course introduces modern methodologies and algorithms for solving complex data analysis problems with artificial intelligence. In particular, the student will become familiar with statistical principles; data mining methods and unsupervised learning techniques; and regression, classification and clustering algorithms, such as decision trees and neural networks. Finally, the student will be able to write programs to solve simple problems using the methodologies covered in the lectures.
Course contents
Place of teaching: Université Clermont Auvergne, Clermont-Ferrand
MODULE 1
This course introduces the basics of statistics, together with modern methodologies and algorithms for solving complex data analysis problems with artificial intelligence and machine learning (ML).
The first part of the lecture covers samples (description and definition of basic quantities: size, dimension, iid; empirical quantities: sample mean, sample variance, quantiles; propagation of uncertainties; binned samples: definition, law of probability), statistical models (definition; ingredients of statistical models: observables, parameters of interest, nuisance parameters, dependent and independent variables; likelihood function and extended likelihood function; composite statistical models; introduction to the treatment of nuisance parameters), inference (introduction to the inference problem, introduction to the frequentist and the Bayesian approaches), and parameter estimation (definition of an estimator; properties of estimators: consistency, bias, efficiency; methods for estimating parameters: maximum likelihood, least squares, Bayesian inference).
The second part covers basic concepts of machine learning (introduction to ML, deep learning and representation learning, training and testing, cross-validation, bias-variance decomposition, curse of dimensionality), regression with linear models (simple example: polynomial curve fitting; linear basis function models; regularization; likelihood and regression), and classification (linear models for classification, perceptron algorithm, linear discriminant analysis, logistic regression, artificial neural networks, popular NN algorithms).
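As a rough illustration of the empirical quantities and estimator properties listed above, the following minimal sketch compares the maximum likelihood estimate of a Gaussian variance (which divides by n and is biased) with the unbiased sample variance (which divides by n - 1). The sample values are made up for illustration and are not course material.

```python
import statistics

# Hypothetical iid sample (made-up values, for illustration only)
sample = [4.9, 5.1, 5.0, 4.8, 5.2]
n = len(sample)

# Sample mean: also the maximum likelihood estimate of a Gaussian mean
mean = statistics.fmean(sample)

# Maximum likelihood estimate of the variance: divides by n (biased)
var_mle = sum((x - mean) ** 2 for x in sample) / n

# Unbiased sample variance: divides by n - 1
var_unbiased = statistics.variance(sample)

print(f"mean={mean:.3f} var_mle={var_mle:.4f} var_unbiased={var_unbiased:.4f}")
```

The two variance estimates differ by the factor n / (n - 1), which tends to 1 as the sample size grows; this is one simple way to see that the maximum likelihood estimator is biased yet consistent.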
MODULE 2
The programming part of the lecture covers a practical introduction to Python (objects, collections, functions, loops and a few Pythonic idioms, basic file manipulation), an introduction to NumPy (NumPy arrays vs Python lists, vectorization, (fancy) indexing, broadcasting), the Python data analysis ecosystem (overview; data representation: Matplotlib; importing and manipulating data: pandas; mathematics, physics and engineering: SciPy), and the basics of image processing (loading/plotting, colors, grey scale, image filters: kernel, blocks, sliding windows). The second part of the lecture is about the manipulation of data, so-called data mining, and includes data preprocessing (data visualization, data cleaning, data space transformation), clustering (hierarchical clustering, partitional clustering), association rules, feature reduction (feature extraction, feature reduction) and hands-on sessions.
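For orientation, the NumPy topics named above (arrays vs lists, vectorization, fancy indexing, broadcasting) can be sketched in a few lines. This is only an illustrative example with made-up values, not an excerpt from the course material.

```python
import numpy as np

values = [1.0, 2.0, 3.0, 4.0]

# Python list: element-wise work needs an explicit loop or comprehension
squares_list = [v * v for v in values]

# NumPy array: the same operation is vectorized over all elements
arr = np.array(values)
squares_arr = arr ** 2

# Fancy indexing: select elements with a list of integer positions
picked = arr[[0, 2]]

# Boolean-mask indexing: keep only entries greater than 2
selected = arr[arr > 2.0]

# Broadcasting: a (3, 1) column plus a (4,) row yields a (3, 4) grid
column = np.arange(3).reshape(3, 1)
grid = column + arr

print(squares_arr, picked, selected, grid.shape)
```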
Readings/Bibliography
Scientific literature and specific publications are distributed during the class.
Teaching methods
MODULE 1
Lecture (50%) and problem-based teaching (50%).
MODULE 2
Lecture (70%) and problem-based teaching (30%).
Assessment methods
Examination: Oral or written examination.
Graded modules
Teaching tools
Classrooms equipped with computers are used for the hands-on sessions. Python, NumPy and Scikit software and libraries are used throughout the four elements of the course.
Office hours
See the website of Julien Noce Donini