- Teacher: Francesco Scalone
- Credits: 6
- Language: English
- Teaching Mode: Traditional lectures
- Campus: Rimini
- Degree programme: Second cycle degree programme (LM) in Statistical, Financial and Actuarial Sciences (code 8877)
- From Feb 10, 2025 to Mar 11, 2025
Learning outcomes
Students will acquire specialised skills to apply data science in fields such as social sciences and demographic analysis. Educational objectives include understanding the fundamentals of data science, using a programming language to analyse demographic and social data, and becoming familiar with open science principles. Students will be able to implement research projects and communicate results effectively.
Course contents
Course Description:
This course focuses on applying data science techniques to social science and population data. Working with real census data and programming in R, with an introduction to Python, students will learn to manage, analyze, and interpret large datasets and to draw meaningful conclusions in the social sciences.
Learning Objectives:
By the end of the course, students will be able to:
- Understand and apply fundamental data science techniques in the context of social sciences.
- Process and analyze real census data using R, with an introduction to Python.
- Effectively visualize data to communicate results.
- Develop critical thinking skills to evaluate and interpret data analysis outcomes.
- Apply knowledge to real-world problems in social sciences and population analysis.
- Utilize generative artificial intelligence as a support tool for data science programming.
Course Content:
Introduction to Data Science:
- Overview of data science in social sciences
- Types of data: structured and unstructured
- Data collection methods and sources
- Open data in demographic and social sciences
Data Visualization:
- Principles of effective data visualization
- Creating visualizations with R, with an introduction to Python
- Interpreting and presenting data visually
Exploratory Data Analysis (EDA):
- Techniques for EDA
- Descriptive statistics and data distributions
- Identifying patterns and anomalies
Statistical Analysis:
- Basic statistical concepts
- Hypothesis testing and inferential statistics
- Regression analysis and correlation
Working with Real Census Data:
- Understanding the structure of census data
- Importing and managing large datasets
- Case studies and practical exercises
Applications in Social Sciences:
- Demographic analysis
- Socio-economic indicators
- Public policies and population studies
Introduction to Generative Artificial Intelligence for Data Science:
- Key generative AI models
- Examples of prompt engineering
Prerequisites:
- Basic knowledge of statistics
- Familiarity with programming concepts (experience with R or Python is helpful but not mandatory)
Readings/Bibliography
Hadley Wickham, Garrett Grolemund, R for Data Science: Import, Tidy, Transform, Visualize, and Model Data, O'Reilly Media, 2017.
Norman Matloff, The Art of R Programming: A Tour of Statistical Software Design, No Starch Press, 2011.
Teaching methods
Assessment methods
- Discussion of a project, carried out individually or in groups of up to three people, using real demographic data. The data to be analyzed will be provided by the instructor and sourced from IPUMS census databases. The analyses will be based on R scripts demonstrated during the lectures. Presentations should be prepared in PowerPoint, and their content will be agreed upon with the instructor.
- Students will receive a pass/fail grade based on their performance and engagement during the preparation and presentation phases of the project.
Teaching tools
Office hours
See the website of Francesco Scalone