- Teacher: Aldo Gardini
- Credits: 6
- SSD: SECS-S/01
- Language: Italian
- Modules: Aldo Gardini (Module 1); Aldo Gardini (Module 2)
- Teaching Mode: Traditional lectures (Module 1); Traditional lectures (Module 2)
- Campus: Rimini
- Course: First cycle degree programme (L) in Statistics, Finance and Insurance (cod. 5901)
- from Sep 16, 2024 to Sep 30, 2024
- from Oct 02, 2024 to Oct 23, 2024
Learning outcomes
At the end of the course, the student is able to use statistical methods to manage large databases and to analyse complex data in support of decisions, investments, and financial and insurance activities. In particular, the student is able to: handle large financial and insurance databases; select and apply the most appropriate methodology to analyse complex data; critically interpret the empirical results obtained.
Course contents
- Managing large databases in R: the tidyverse and data.table packages. Connections with SQL.
- Data wrangling and visualization. Basic notions of the ggplot2 package.
- Dimensionality reduction. Principal component analysis and eigen-portfolios.
- Classification and regression: machine learning methods. Regression trees, Random Forest.
Readings/Bibliography
C. Wright, S. E. Ellis, S. C. Hicks, R. D. Peng (2021). Tidyverse Skills for Data Science. https://jhudatascience.org/tidyversecourse/
G. James, D. Witten, T. Hastie, R. Tibshirani (2013). An Introduction to Statistical Learning. New York: Springer.
S. Mignani, A. Montanari (1997). Appunti di analisi statistica multivariata, second edition. Bologna: Esculapio (Chapter 3: principal component analysis).
Further material will be provided during the lectures.
Teaching methods
Each topic covered in the lectures will be followed by exercises in practical classes using R.
Assessment methods
The final exam aims at evaluating the achievement of the following objectives:
- Knowledge of the R tools covered in the lab sessions
- Ability to apply statistical methods to real data
- Ability to interpret the obtained results
The final exam consists of a two-hour written test in the laboratory, using R.
A dataset will be provided, and students must carry out the requested analyses on it. The lecture scripts will be provided to students.
Teaching tools
Slides and R scripts available on Virtuale.
Office hours
See Aldo Gardini's website.