B3095 - Big Data, Data Mining and Data Analytics Workshop Classes - Cesena Campus

Academic Year 2024/2025

  • Modules: Marco Calbucci (Module 1), Stefano Castagnoli (Module 2)
  • Teaching Mode: Traditional lectures (Module 1), Traditional lectures (Module 2)
  • Campus: Cesena
  • Degree programme: First cycle degree programme (L) in Computer Systems Technologies (code 6007)

Learning outcomes

By the end of the course, students will have advanced skills and practical abilities with both relational and non-relational databases, and will be able to develop applications built around a DBMS. They will understand the application areas in which Big Data technologies can be used and the associated challenges; be familiar with the hardware and software architectures proposed for managing Big Data; know the techniques for data storage; use the programming languages and paradigms adopted in these systems; and understand the design methodologies for different types of Big Data applications. They will have practical skills in using various tools, be familiar with the main techniques of data mining and text mining, and understand project management and development methodologies. Through hands-on exercises with commercial and/or open-source tools, they will develop practical skills in generating, analyzing, and interpreting results.

Course contents

  • Introduction to Python and the libraries NumPy and Pandas.
  • Statistical concepts and exercises with the Statsmodels library.
  • Data visualization with matplotlib and seaborn.
  • Data collection and cleaning processes.
  • Predictive analysis models (regressions, classifications) with scikit-learn.
  • Embeddings and vector databases.
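
As a rough illustration of how the data cleaning and predictive analysis topics above fit together, here is a minimal sketch in Python. It is not taken from the course material: the toy dataset, the column names `hours` and `score`, and the helper `predict` are hypothetical, and NumPy's `polyfit` is used as a simple stand-in for a scikit-learn regressor.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not from the course): hours studied vs. exam score,
# with one duplicated row and one missing value for the cleaning step.
raw = pd.DataFrame(
    {
        "hours": [1.0, 2.0, 2.0, 3.0, None, 5.0],
        "score": [52.0, 54.0, 54.0, 56.0, 58.0, 60.0],
    }
)

# Data cleaning: drop exact duplicate rows, then rows with missing values.
clean = raw.drop_duplicates().dropna().reset_index(drop=True)

# Simple linear regression: degree-1 least-squares fit with NumPy
# (an assumption made here in place of a scikit-learn model).
slope, intercept = np.polyfit(
    clean["hours"].to_numpy(), clean["score"].to_numpy(), deg=1
)


def predict(hours: float) -> float:
    """Predict a score from hours studied using the fitted line."""
    return float(slope * hours + intercept)


if __name__ == "__main__":
    print(clean)
    print(f"score = {slope:.2f} * hours + {intercept:.2f}")
    print(f"predicted score for 4 hours: {predict(4.0):.1f}")
```

The same cleaned DataFrame could be passed to a scikit-learn estimator instead; the fit with `polyfit` only keeps the sketch to two dependencies.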

Readings/Bibliography

Foster Provost, Tom Fawcett, Data science for business, O'Reilly (2013)

John V. Guttag, Introduction to computation and programming using Python, The MIT Press (2016)

Cole Nussbaumer Knaflic, Storytelling with data, Wiley (2015)

Teaching methods

Lectures with PowerPoint slides

Lab exercises

Given the type of activity and the teaching methods adopted, all students must complete Modules 1 and 2 of the training on safety in study environments, delivered in e-learning mode, before attending this course: https://corsi.unibo.it/laurea/TecnologieSistemiInformatici/formazione-obbligatoria-su-sicurezza-e-salute

Assessment methods

Lab exercises

Coding project with Pandas

Oral exam: project discussion and questions on the covered topics

Teaching tools

PowerPoint slides

Anaconda development environment, Visual Studio Code editor

Office hours

See the website of Marco Calbucci

See the website of Stefano Castagnoli