Program

Updated A.Y. 2021-2022

The course will discuss big data analysis for empirical work in economics, finance, and business. We will start with an overview of the applications of big data analytics and of data sources, and then clarify the main characteristics, advantages, and limitations of big data. Statistical and machine learning tools for supervised and unsupervised learning with big data will be covered in detail, including regularization methods for shrinkage and feature selection. A brief introduction to neural networks and deep neural networks will also be given. Finally, we will discuss the main applications, with particular attention to text mining. The R software for statistical computing will be used throughout.

Schedule of Topics

Characteristics of big data: advantages, limitations, and opportunities. Sources of big data and motivating applications: web scraping, social media, Google. Architectures for big data collection, analysis, and storage. Working with a large sample size: sub-sampling and batching. Principles of prediction and tuning parameter choice.
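
As an illustration of the sub-sampling and batching ideas listed above, here is a minimal sketch. It is written in Python rather than the R used in the course, and the function names `reservoir_sample` and `batches` are hypothetical, chosen only for this example. It assumes the data arrive as a stream too large to hold in memory; reservoir sampling is one common way to draw a uniform sub-sample in a single pass, not necessarily the method taught in the course.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Draw k items uniformly at random from an iterable of unknown length.

    Only k items are kept in memory at any time (reservoir sampling).
    If the stream has fewer than k items, all of them are returned.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


def batches(stream, size):
    """Yield consecutive lists of at most `size` items from an iterable."""
    batch = []
    for item in stream:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


if __name__ == "__main__":
    sample = reservoir_sample(range(1_000_000), k=5, seed=1)
    print(len(sample))
    print([len(b) for b in batches(range(7), 3)])
```

Sub-sampling trades some precision for a data set that fits in memory, while batching processes every observation but only a chunk at a time.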

Unsupervised learning: k-means, partitioning around medoids (PAM), trimmed k-means. Supervised learning: regularization and feature selection for linear and non-linear regression models. Ridge regression, LASSO, elastic net.
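
To make the difference between shrinkage and feature selection concrete, the sketch below applies the three penalties to a single least-squares coefficient. It is written in Python rather than the R used in the course, and it relies on a simplifying assumption that is not part of the syllabus: an orthonormal design (X'X = I), with objectives (1/2)RSS + (lam/2)·sum(b²) for ridge, (1/2)RSS + lam·sum|b| for LASSO, and (1/2)RSS + lam1·sum|b| + (lam2/2)·sum(b²) for the elastic net. Under that assumption each estimate has a closed form in terms of the OLS coefficient. The function names are hypothetical.

```python
def ridge_shrink(beta_ols, lam):
    """Ridge: scales the OLS coefficient towards zero, never exactly to zero."""
    return beta_ols / (1.0 + lam)


def lasso_shrink(beta_ols, lam):
    """LASSO: soft-thresholding; coefficients with |beta_ols| <= lam become 0."""
    if beta_ols > lam:
        return beta_ols - lam
    if beta_ols < -lam:
        return beta_ols + lam
    return 0.0


def elastic_net_shrink(beta_ols, lam1, lam2):
    """Elastic net: soft-threshold by lam1, then scale by 1 / (1 + lam2)."""
    return lasso_shrink(beta_ols, lam1) / (1.0 + lam2)


if __name__ == "__main__":
    for b in (2.0, 0.3, -2.0):
        print(b, ridge_shrink(b, 1.0), lasso_shrink(b, 0.5),
              elastic_net_shrink(b, 0.5, 0.5))
```

In this simplified setting ridge only shrinks, whereas the LASSO and the elastic net can set small coefficients exactly to zero, which is what makes them usable for feature selection.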

Machine learning approaches for supervised learning: k-nearest-neighbors, classification and regression trees, random forests. An overview of neural networks and deep learning. Transfer learning.

Images, sound, and text as sources of information. Text mining: natural language processing, latent Dirichlet allocation, sentiment analysis.

Other topics might be mentioned, including market basket analysis and recommender systems.

Exam rules

The final evaluation is based on an oral examination. There will be no mid-term exam.
The exam will cover the entire program. It will be based on three
questions, one of which will be an impromptu practicum: the student will
have to perform a data analysis using the R software.


Students must register for the exam in advance. Students who have not
registered will not be allowed to take the exam. Final marks are uploaded
to the Delphi system, and each candidate receives their mark individually by email.