- Teacher: Fabio Tamburini
- Credits: 6
- SSD: L-LIN/01
- Language: Italian
- Teaching Mode: Traditional lectures
- Campus: Bologna
- Course: First cycle degree programme (L) in Arts (code 0958)
Learning outcomes
The course provides knowledge of the basic processes and
methodologies of corpus building and automatic text
processing.
Course contents
APPLIED LINGUISTICS - 6 CFU:
- CORPORA
- What a corpus is, how to use it and the kinds of information it
provides.
- Parameters for corpus design. Representativeness.
- Syntagmatic and paradigmatic analysis.
- Concordances, collocations and lexical association indexes.
- Annotations
- Electronic texts, coding, mark-up format and conversion methods.
- How to collect electronic texts.
- Corpus access and text retrieval.
- Web as corpus.
- Laboratory: building and using a tagged corpus.
APPLIED LINGUISTICS - 12 CFU:
In addition to the 6 CFU contents above:
- NATURAL LANGUAGE PROCESSING
- Phonetics: Speech sample parameters - phones and formants - frequency analysis - suprasegmentals.
- Morphology: Analysis and generation.
- Syntax: Part-of-speech tagging - parsing and formal
grammars.
- Semantics: Lexical semantics - Wordnets.
- Laboratory of computational linguistics
Readings/Bibliography
Selected chapters from:
- Cresti, E., Panunzi, A. (2013). Introduzione ai corpora
dell'italiano. Il Mulino.
- Lenci, A., Montemagni, S., Pirrelli, V. (2005). Testo e
computer. Carocci.
- McEnery, T., Wilson, A. (1996). Corpus Linguistics.
Edinburgh University Press.
Slides, handouts and papers can be downloaded from the course web site:
http://corpora.ficlit.unibo.it/LingAppl/
Teaching methods
Face-to-face classes: 60 hours (12 CFU) or 30 hours (6 CFU).
Assessment methods
Oral examination.
Students must register for the exam through the online procedure.
Teaching tools
The course web site is the main source of information about the course. It hosts the handouts and readings discussed in class, as well as a software repository for laboratory practice.
A CD-ROM prepared for students contains a complete computing environment for practising the procedures presented during the course. It will also be used in the laboratory sessions.
Links to further information
http://corpora.ficlit.unibo.it/LingAppl/
Office hours
See the website of Fabio Tamburini