Syllabus


Learning Objectives




LEARNING OUTCOMES: The course will discuss big data analysis for empirical exercises
in economics, finance, and business.
KNOWLEDGE AND UNDERSTANDING: Statistical and machine learning tools for
supervised and unsupervised learning with big data will be detailed.
APPLYING KNOWLEDGE AND UNDERSTANDING: We will discuss the main applications,
with some more attention given to text mining. The R software will be used throughout.
MAKING JUDGEMENTS: The student will have to be able to choose the most appropriate
statistical method, interpret its result, and be aware of its limitations.
COMMUNICATION SKILLS: The student will have to be able to communicate the results
through graphs, tables, and comments.
LEARNING SKILLS: The students will be made aware of strengths and limitations of the
methods, and pointed to further reading.

ALESSIO FARCOMENI

Prerequisites

Attending students must have a good knowledge of linear and logistic regression models.
For students of the LM in Economics, it is advised to have at least attended the following
courses: “Statistics”, “Econometrics”, and possibly also “Microeconometrics”.
For students of the LM in Finance and Banking, it is advised to have passed “Mathematics”,
and at least attended “Statistics” and possibly also “Time series and econometrics”.

Program

Characteristics of Big data. Sources of big data and motivating applications: web scraping,
social media, Google. Architectures for big data collection, analysis, and storage. Working
with a large sample size. Principles of prediction and tuning parameter choice.
Unsupervised Learning: k-means, PAM, trimmed k-means. Supervised learning:
regularization and feature selection for linear and non-linear regression models. Ridge
regression, LASSO, elastic net. Machine learning approaches for supervised learning:
k-nearest-neighbors, classification and regression trees, random forests. An overview of
neural networks and deep learning. Images, sounds, text, as sources of information. Text
mining: natural language processing, latent Dirichlet allocation, sentiment analysis.

Books

Boehmke B., Greenwell B. (2019). Hands-On Machine Learning with R. Chapman
& Hall/CRC Press.
Hastie T., Tibshirani R., Friedman J. (2009). The Elements of Statistical Learning: Data
Mining, Inference, and Prediction, Second Edition. Springer, Springer Series in Statistics.

Teaching methods

The course consists of lectures and practicums. Techniques will be introduced
through examples and described with mathematical formulas. The focus will be on the
practical use of each technique and the interpretation of the results.

Exam Rules

The exam will be written, with a mix of open-ended and closed-ended questions. Questions will cover the entire course material. Some questions will present R code or R output and will concern its interpretation.

Students must book the exam in advance. Students who have not booked will not be allowed to take the exam.

Students will have to demonstrate that they can choose the most appropriate statistical methodology, that they know its limitations and strengths, and that they can implement each technique and interpret the results.

Students who fail or withdraw from the exam may take it again in the same exam session.
