- Lecturer: Edoardo Redivo
- Credits: 6
- SSD: SECS-S/01
- Teaching language: English
- Teaching mode: Traditional, in-person lectures
- Campus: Bologna
- Degree programme: Laurea in Scienze statistiche (Statistical Sciences, code 8873)
- From 11/11/2024 to 17/12/2024
Learning outcomes
By the end of the course, students will know the methods for linking records that refer to the same statistical unit when those records come from different archives and the unit is not identified by an error-free code. Students will be able to perform exact matching through deterministic and probabilistic record linkage, and to use the basic tools of statistical matching.
Course contents
- The statistical formalisation of the record linkage problem
- Deterministic record linkage
- String similarity functions
- Blocking
- Fellegi-Sunter procedure
- Latent class model and its estimation via the EM algorithm
- Record linkage as an assignment problem
- Supervised classification for record linkage tasks
- More recent developments and Bayesian models for record linkage
Readings/Bibliography
Christen, P. (2012). Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection. Springer. ISBN: 978-3-642-43001-5.
Herzog, T. N., Scheuren, F. J., Winkler, W. E. (2007). Data Quality and Record Linkage Techniques. Springer. ISBN: 978-0-387-69502-0.
Binette, O., Steorts, R. C. (2022). (Almost) All of Entity Resolution. Science Advances, 8(12). https://doi.org/10.1126/sciadv.abi8021.
Teaching methods
Lectures and tutorials in R.
To participate in computer lab sessions, students must complete Modules 1 and 2 of health and safety training, available as online courses.
Assessment methods
Written exam, taken using R, with both practical and theoretical exercises.
Paper notes and resources are allowed, while electronic and online resources are not.
Teaching tools
Slides and blackboard.
Office hours
See the website of Edoardo Redivo