- Lecturer: Matteo Farnè
- Credits: 10
- SSD: SECS-S/01
- Teaching language: English
- Modules: Matteo Farnè (Module 1); Matteo Farnè (Module 2)
- Teaching mode: Traditional, in-person lectures (Module 1); Traditional, in-person lectures (Module 2)
- Campus: Bologna
- Course: Second-cycle degree (Laurea Magistrale) in Statistica, economia e impresa (code 8876)
- Class timetable (Module 1): from 17/09/2024 to 22/10/2024
- Class timetable (Module 2): from 12/11/2024 to 13/12/2024
Learning outcomes
The aim of the course is to learn the fundamentals of the most important multivariate techniques that help to make intelligent use of large databases by recognizing patterns for predicting or estimating an output based on one or more inputs. At the end of the course the student is able:
- to represent and organize knowledge about big data collections;
- to turn data into actionable knowledge;
- to choose the methodology best suited to the problem at hand and to critically interpret the results.
Course contents
Introduction to Supervised Statistical Learning
Resampling methods:
- Cross-Validation
- Bootstrap
Classification:
- Naive Bayes
- k-Nearest Neighbours
- Logistic Regression
- Linear Discriminant Analysis
Dimension Reduction:
- Principal Component Analysis
- Principal Component Regression and Partial Least Squares
- Factor model: definition and estimation
- Clustering using R software
Regularisation:
- Lasso and Ridge Regression
Smoothing methods in R software
Tree-based methods:
- Regression and Classification trees
- Bagging; Random Forests; Boosting
Machine learning methods:
- Support Vector Machines
- Neural Networks
Readings/Bibliography
PDF slides and R code for each topic will be provided by the lecturer on Virtuale.
The primary textbook for the course topics is:
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning. Second Edition. New York: Springer. ISBN: 978-1-0716-1417-4. E-book ISBN: 978-1-0716-1418-1.
The book is freely available here:
https://hastie.su.domains/ISLR2/ISLRv2_website.pdf
In addition, we will use:
- Hastie, T., Tibshirani, R., & Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Verlag.
Freely available at: https://web.stanford.edu/~hastie/Papers/ESLII.pdf
Teaching methods
Theoretical lectures and practical sessions in RStudio.
Assessment methods
Learning is assessed through a written test lasting between 60 and 90 minutes. The test consists of theoretical and practical questions, aimed at assessing the student's knowledge of the statistical methods covered and the student's ability to perform statistical analyses and to interpret the resulting outputs in RStudio. The final grade is out of thirty.
During the written exam, students may use only the cheat sheet provided on virtuale.unibo.it, which contains references to R packages and functions. The textbook, personal notes, and mobile phones are not allowed, nor are smart watches or similar electronic data storage or communication devices.
Students who have passed the exam but feel that the grade does not reflect their preparation may request an additional (optional) oral exam, which can change the grade by +/-3 points.
Teaching tools
Students will be provided with course slides, commented R code, and mock exams. An electronic tablet will be used during lectures. The R files accompanying the primary textbook will be taken from the website https://www.statlearning.com/resources-second-edition.
Office hours
See the website of Matteo Farnè