Morten Hannemose
Title: A history of generative models
Abstract: Generative models are everywhere, but where did they come from and where did they go?
This talk will cover selected topics such as active appearance models, autoencoders, variational autoencoders, generative adversarial networks, and finally diffusion models, giving you a taste of each method and of how they relate to one another.
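To give a flavour of how two of these methods relate, below is a minimal sketch (assuming PyTorch; the layer sizes and dummy data are illustrative, not from the talk) of a variational autoencoder, which extends a plain autoencoder with a sampled latent code and a KL regularizer:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, z_dim)      # posterior mean
        self.logvar = nn.Linear(256, z_dim)  # posterior log-variance
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def neg_elbo(x, x_hat, mu, logvar):
    # Negative ELBO = reconstruction error + KL(q(z|x) || N(0, I));
    # a plain autoencoder would keep only the reconstruction term.
    rec = nn.functional.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

x = torch.rand(32, 784)  # a dummy batch
model = VAE()
x_hat, mu, logvar = model(x)
loss = neg_elbo(x, x_hat, mu, logvar)
loss.backward()
```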
Related papers
- T.F. Cootes et al., Active Appearance Models, 2001.
- I. Goodfellow et al., Generative Adversarial Networks, 2020.
- R. Rombach et al., High-Resolution Image Synthesis with Latent Diffusion Models, 2022.
Line Katrine Harder Clemmensen
Title 1: Data Representativity in ML/AI.
Abstract: Data representativity is crucial for the inference we make through our ML models. This talk investigates how the community defines, uses, and measures data representativity. We review papers from ICCV and NeurIPS, propose a set of metrics to help us assess data representativity, and show how different representativity concepts meet different prediction and fairness goals.
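As a toy illustration of what a representativity check can look like (this is a hypothetical example, not one of the metrics proposed in the talk), one can compare each feature's marginal distribution in a dataset against a reference population:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
population = rng.normal(0.0, 1.0, size=(10_000, 3))        # reference population
sample = rng.normal([0.5, 0.0, 0.0], 1.0, size=(500, 3))   # feature 0 is shifted

for j in range(population.shape[1]):
    stat, p = ks_2samp(population[:, j], sample[:, j])
    print(f"feature {j}: KS statistic = {stat:.3f}, p-value = {p:.3g}")
# A large KS statistic / small p-value flags a feature for which the
# dataset is unlikely to be representative of the population.
```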
Title 2: Low resource modelling
Jes Frellsen
Title: How to deal with missing data in unsupervised and supervised deep learning?
Abstract: Data with missing values is the norm in most real-world applications of deep learning. This tutorial aims to provide an overview of concepts and methods for effectively handling missing values within a statistical framework. We will explore techniques for addressing missing data in both unsupervised and supervised deep learning scenarios.
To begin, we will introduce the core concepts and methodologies employed in handling missing values. Next, we will discuss the intricacies of handling missing data within unsupervised deep learning. Specifically, we will examine how deep latent variable models can be leveraged to effectively learn from data with missing values.
Moving on, we will focus on strategies for adapting neural architectures to handle missing values in supervised deep learning. We will explore approaches that allow us to accommodate missing values during the training process. Specifically, we will investigate how deep latent variable models can enable us to marginalize over the missing values.
By the end of this tutorial, participants will gain an understanding of the core concepts and various techniques for handling missing data in unsupervised and supervised deep learning.
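As a concrete illustration, here is a masked-VAE sketch in the spirit of MIWAE, using a single importance sample (assuming PyTorch; all dimensions and data are illustrative): the encoder sees the zero-imputed input together with the missingness mask, and the likelihood is evaluated on observed entries only.

```python
import torch
import torch.nn as nn

x_dim, z_dim = 20, 4
enc = nn.Sequential(nn.Linear(2 * x_dim, 64), nn.ReLU(), nn.Linear(64, 2 * z_dim))
dec = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, x_dim))

x = torch.randn(8, x_dim)                   # dummy data
mask = (torch.rand_like(x) > 0.3).float()   # 1 = observed, 0 = missing
x_obs = x * mask                            # zero-impute missing entries

h = enc(torch.cat([x_obs, mask], dim=1))    # encoder sees values and mask
mu, logvar = h.chunk(2, dim=1)
z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
x_hat = dec(z)

# Reconstruction error on observed entries only; KL term as in a standard VAE.
rec = ((x_hat - x) ** 2 * mask).sum()
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
loss = rec + kl
loss.backward()
```

MIWAE proper tightens this bound by averaging over multiple importance-weighted samples of z; the single-sample version above is the simplest instance of the same idea.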
Related papers
- Ghahramani Z, Jordan MI (1995). Learning from incomplete data. http://hdl.handle.net/1721.1/7202
- Mattei P-A, Frellsen J (2019) MIWAE: Deep Generative Modelling and Imputation of Incomplete Data. ICML 2019, PMLR 97:4413-4423. https://proceedings.mlr.press/v97/mattei19a.html
- Ipsen NB, Mattei P-A, Frellsen J (2022) How to deal with missing data in supervised deep learning? ICLR 2022. https://openreview.net/forum?id=J7b4BCtDm4
In Ghahramani and Jordan (1995), I recommend focusing on sections 1-2, which give a brief introduction to the statistical framework for handling missing data. For a complete introduction, I refer to Little and Rubin (2019) Statistical Analysis with Missing Data.
Kamal Nasrollahi
Title: Concept drift, labelling, and data generation
Abstract: In this talk, we look at the largest public thermal dataset that we have annotated and released for video analytics. The dataset is annotated with about 6 million bounding boxes of objects. We first show the effect of concept drift in this dataset, then discuss extending the annotations from mere bounding boxes to more behavioral information, such as anomalies. We also touch upon the effect of synthesizing data for this dataset.
Robert Jenssen
Talk 1
This talk will describe a new method for self-supervised representation learning in the medical image domain. The method is used for content-based CT image retrieval and leverages a clinically relevant data augmentation technique. Furthermore, a new method for explaining representation learning is presented. This method, called RELAX, is used to investigate the effect of the clinically relevant data augmentation.
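For context, the contrastive principle underlying augmentation-based self-supervised learning can be sketched with a generic NT-Xent (SimCLR-style) loss. This is a generic illustration (assuming PyTorch; the embeddings below stand in for encoded augmented views), not the talk's method:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """z1, z2: (N, d) embeddings of two augmented views of the same images."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)
    sim = z @ z.t() / temperature                    # (2N, 2N) similarities
    mask = torch.eye(sim.shape[0], dtype=torch.bool)
    sim = sim.masked_fill(mask, float("-inf"))       # exclude self-pairs
    n = z1.shape[0]
    # Row i's positive is the other view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(16, 128), torch.randn(16, 128)  # dummy view embeddings
loss = nt_xent(z1, z2)
```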
Talk 2
This talk will study a scenario where we wish to learn joint representations and generative processes from multi-modal data. Our starting point is the realistic scenario in which all modalities and class labels are available for model training, but where some modalities and labels required for downstream tasks are missing. We introduce a novel conditional multi-modal discriminative model that uses an informative prior distribution and optimizes a likelihood-free objective function that maximizes mutual information between joint representations and missing modalities.
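As a sketch of what a likelihood-free, mutual-information-maximizing objective can look like, here is an InfoNCE bound with a bilinear critic (assuming PyTorch; all names and dimensions are illustrative, not the talk's actual architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InfoNCE(nn.Module):
    def __init__(self, rep_dim=64, mod_dim=32):
        super().__init__()
        self.W = nn.Linear(mod_dim, rep_dim, bias=False)  # bilinear critic

    def forward(self, joint_rep, modality_emb):
        # Scores of all (representation, modality) pairs in the batch;
        # the diagonal holds true pairs, off-diagonal entries are negatives.
        logits = joint_rep @ self.W(modality_emb).t()
        targets = torch.arange(joint_rep.shape[0])
        return F.cross_entropy(logits, targets)

critic = InfoNCE()
joint_rep = torch.randn(16, 64)     # joint representation of observed modalities
modality_emb = torch.randn(16, 32)  # embedding of the "missing" modality
loss = critic(joint_rep, modality_emb)  # minimizing this maximizes an MI bound
```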
Related papers
- RELAX: Representation Learning Explainability. International Journal of Computer Vision, 2023
- A clinically motivated self-supervised approach for content-based image retrieval of CT liver images. Computerized Medical Imaging and Graphics, 2023
Mads Nielsen
In this talk, I will present an outlook on the field in relation to the topics presented at the summer school, including advanced image augmentation and its use in brain analysis.
Summer school missing data challenge
The exercises will be in the form of a team challenge. Details can be found here: https://github.com/RasmusRPaulsen/MissingDataChallenge
Please try to download the data for the challenge before arriving at the summer school.
You will be divided into teams at the start of the summer school.