Updated A.Y. 2019-2020

Course Description

This is a course in categorical microeconometrics. It covers a selection of statistical techniques for categorical data analysis. The R software for statistical computing will also be introduced and used throughout.

We will start with practicums dedicated to an introduction to the R software. The objective is to give students the ability to perform data preparation and analysis.

We will then introduce models for binary and count outcomes. These Generalized Linear Models extend linear regression to categorical outcomes. As an example, suppose you want to predict the risk that a firm is about to go bankrupt. This can be done by taking variables that can be measured for every firm (number of employees, liquidity ratio, profitability ratio, leverage ratio, etc.) and using them to predict bankruptcy, based on a sample of firms whose status is known. Incidentally, you will also understand how the profitability ratio affects the risk of bankruptcy.
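To make the bankruptcy example concrete, here is a minimal sketch of a logistic regression fitted to synthetic data. It is an illustration only, not course material: the course itself uses R (where such a model is typically fitted with glm and a binomial family), while this sketch uses plain Python with no external libraries. The data are simulated, and the names fit_logistic, predict_risk, TRUE_BETA, firms and status are hypothetical, as are the two standardized predictors (profitability and leverage).

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function: maps any real number into (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_risk(beta, x):
    # beta[0] is the intercept; beta[1:] are the slopes, one per predictor.
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


def fit_logistic(X, y, lr=0.5, n_iter=2000):
    # Maximum likelihood by plain gradient ascent on the mean log-likelihood.
    # (Statistical software normally uses Fisher scoring instead; gradient
    # ascent is used here only because it is short to write.)
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            resid = yi - predict_risk(beta, xi)
            grad[0] += resid
            for j in range(p):
                grad[j + 1] += resid * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Simulated sample of 500 firms (assumption: purely synthetic data).
# Predictors: standardized profitability ratio and leverage ratio.
random.seed(0)
TRUE_BETA = [-1.0, -2.0, 1.5]  # intercept, profitability, leverage
firms, status = [], []
for _ in range(500):
    x = [random.gauss(0, 1), random.gauss(0, 1)]
    firms.append(x)
    # status = 1 means the firm went bankrupt.
    status.append(1 if random.random() < predict_risk(TRUE_BETA, x) else 0)

beta_hat = fit_logistic(firms, status)

# exp(slope) is the multiplicative change in the odds of bankruptcy
# for a one-unit increase in the predictor, holding the other fixed.
odds_ratio_profitability = math.exp(beta_hat[1])

# Predicted risk for two hypothetical firms with average leverage.
risk_low_profit = predict_risk(beta_hat, [-1.0, 0.0])
risk_high_profit = predict_risk(beta_hat, [1.0, 0.0])
```

In this simulated setting the estimated profitability slope is negative, so its odds ratio is below 1: more profitable firms have lower predicted odds of bankruptcy. The same fitted model thus serves both purposes mentioned above, prediction for new firms and interpretation of how a predictor relates to the risk.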

The course will also discuss methods tailored to prediction, including for multi-category outcomes, some basic ideas of causal analysis, and how to deal with dependent clustered data. Clustered data arise when units are dependent within groups (e.g., students within classes) or when there are repeated measurements on the same units (e.g., over time with panel data).

A detailed syllabus is available for registered students together with the course materials. 


Introduction to the R software

Categorical data: multi-way tables, probability distributions, the generalized linear model

Principles of causal analysis: confounders, colliders, mediators. Simpson's Paradox. Lord's Paradox. 

Logistic regression

Poisson regression

Analysis of clustered and panel data: mixed models

Principles of classification for predictive purposes. Classification trees and random forests


Attending students must have passed Quantitative Methods I and Quantitative Methods II from the B.D. in Business Administration and Economics, or similar courses. A good understanding of multiple linear regression models is mandatory. Taking the course without knowledge of basic statistics, probability, statistical inference and the multiple linear regression model is discouraged.

A detailed list of required topics is given in the complete syllabus. 


A comprehensive, though rather technical, book is:

Cameron, A. C. and Trivedi, P. K. (2005) Microeconometrics: Methods and Applications. Cambridge University Press. 

A more introductory book is: 

Agresti, A. (2018) An Introduction to Categorical Data Analysis, 3rd Edition. Wiley.

Additional handouts and resources will be posted on the course website. 

Additional suggested reading: 

This book is useful for practicing and learning a bit more about R:

Everitt, B. S. and Hothorn, T. (2006) A Handbook of Statistical Analyses Using R. CRC Press. 

The final evaluation is based on a mandatory group assignment and a closed-book written exam. There will be no mid-term exam.

The group assignment contributes 50% of the final mark and is based on the practicums. Students must submit a text file containing the code used to carry out the assignment, and a brief report (no more than 8 pages including tables and graphics). More details are given in the complete syllabus.

The closed-book written exam will cover the entire course program. It is an individual exam and contributes the remaining 50% of the student's mark.