- Teacher: Fabio Tamburini
- Credits: 9
- SSD: L-LIN/01
- Language: Italian
- Teaching Mode: Traditional lectures
- Campus: Bologna
- Course: Second cycle degree programme (LM) in LANGUAGE, SOCIETY AND COMMUNICATION (cod. 0982)
Learning outcomes
The course examines in detail some of the main topics in Natural Language Processing, with particular emphasis on advanced empirical methods for linguistic analysis.
Course contents
- Part I: Advanced techniques for corpora handling
  - Statistical methods for text analysis
  - Regular expressions
  - Text annotation techniques
  - XML and TEI
- Part II: Building and evaluating applications
  - Techniques in Machine Learning
  - Methods for evaluating application performance
  - Notions of Stylometry and Dialectometry
- Part III: Formal grammars for language analysis
  - Formal languages and natural language
  - Context-free grammars
  - Categorial grammars for syntactic analysis
  - Meaning representation
  - Categorial Type Logic
Readings/Bibliography
- Lenci, A., Montemagni, S. and Pirrelli, V. (2005). Testo e
computer. Carocci.
- Ritchie G. and Mellish C. (2000). Techniques in Natural Language Processing.
- Oakes M.P. (1998). Statistics for Corpus Linguistics. Edinburgh Textbooks in Empirical Linguistics.
- Mitkov R. (ed.) (2003). The Oxford Handbook of Computational Linguistics.
- Jurafsky D. and Martin J.H. (in press). Speech and Language Processing, 2nd ed. Prentice Hall.
Students who have never attended an introductory course in Computational Linguistics are strongly encouraged to read the following materials BEFORE the course begins:
- Lenci A., Montemagni S. and Pirrelli V. (2005). Testo e computer. Carocci. [Ch. 1, 7, Sec. 8, 8.1, 8.2, 8.3]
- Chiari I. (2007). Introduzione alla linguistica computazionale. Laterza. [Ch. 1, 2]
Teaching methods
30 hours of face-to-face classes and laboratory sessions.
Assessment methods
Oral examination.
Teaching tools
The course website is the central source for all information about the course. It contains the handouts and readings discussed during the lessons, as well as a rich software repository for laboratory practice.
A CD-ROM containing a complete computing environment has been prepared so that students can practise the procedures presented during the course. It will also be used in the laboratory sessions.
Links to further information
http://corpora.dslo.unibo.it/LingCompLM_LET/
Office hours
See the website of Fabio Tamburini