B5176 - BIG DATA ANALYTICS

Academic Year 2024/2025

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Degree programme: Second cycle degree programme (LM) in Business Administration (cod. 0897)

Learning outcomes

By the end of the course, students know the statistical models underlying the extraction of knowledge from large volumes of data (Big Data). In particular, students are able to: - structure a data mining process; - choose, among the methodological tools, those best suited to the objective at hand; - critically interpret the results.

Course contents

  1. Multivariate linear models: theory of OLS, Gauss-Markov hypotheses and inference, definition of marginal effects, dummy and categorical variables and their interpretation, prediction, model selection, omitted variable bias and inefficiency from irrelevant variables. Nested/non-nested models. Violation of the hypotheses: residual analysis and specification tests (heteroskedasticity, endogeneity, non-normality), robust OLS, alternative estimators, endogeneity example. Power transformations. Models that are nonlinear in the regressors.
  2. Time series: definition, residual analysis and specification tests (structural break and autocorrelation), robust OLS, time series components, forecasting with classical methods, statistical performance of forecasting methods.
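
As a small illustration of two of the topics listed above (OLS estimation with the slope read as a marginal effect, and residual analysis for autocorrelation), the following is a minimal sketch. It is not course material: the lectures use R, while this sketch uses plain Python, and the toy data and the helper names `ols_fit` and `durbin_watson` are assumptions introduced here for illustration only.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx                  # marginal effect of x on y
    intercept = y_bar - slope * x_bar
    return intercept, slope


def durbin_watson(residuals):
    """Durbin-Watson statistic: values near 2 suggest no first-order autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical toy data, ordered in time (not taken from the course datasets).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = ols_fit(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
dw = durbin_watson(residuals)

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, DW = {dw:.3f}")
```

With an intercept in the model the residuals sum to zero by construction, and the Durbin-Watson statistic always lies between 0 and 4; on a sample this small it is only indicative.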

Readings/Bibliography

Main References: 

William Greene (2019), Econometric Analysis, Eighth Edition (Global Edition), Pearson.

Bradley Efron, Trevor Hastie (2016), Computer Age Statistical Inference: Algorithms, Evidence, and Data Science, Cambridge University Press.

Trevor Hastie, Robert Tibshirani, Jerome Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Second Edition).

Marno Verbeek (2005), Econometria, 1st edition, Zanichelli Editore.

Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani (2021), An Introduction to Statistical Learning with Applications in R, Springer.

Additional reference for basic R:

Giuseppe Espa, Rocco Micciolo (2014), Problemi ed Esperimenti di Statistica con R, Apogeo.

Further references:

Tsai Chun-Wei et al. (2015), Big Data Analytics: a survey, Journal of Big Data, 2:21.

Teaching methods

Lectures cover both the theoretical/methodological and the empirical aspects of economic applications, using the statistical software R.

All economic datasets used are either available in R or provided by the professor.

Assessment methods

Written examination (about 2 hours)

Possible additional oral examination

Teaching tools

PC; video projector.

Office hours

See the website of Anna Gloria Billè