Global Sinkhorn Autoencoder — Optimal Transport on the latent representation of the full dataset

Authors
Adrián Csiszárik, Melinda F. Kiss, Balázs Maga, Ákos Matszangosz, Dániel Varga
Published at
Annales Univ. Sci. Budapest., Sect. Comp. 57 (2024) 101–115
Date
26 June 2024

We propose an Optimal Transport (OT)-based generative model from the Wasserstein Autoencoder (WAE) family, with one distinguishing property: the positions of the latent points are optimized over the full training dataset rather than over a minibatch. Our contributions are as follows:

  1. We define a new class of global Wasserstein Autoencoder models, and implement an Optimal Transport-based incarnation we call the Global Sinkhorn Autoencoder.
  2. We implement several metrics for evaluating such models in both the unsupervised and the semi-supervised setting: the global OT loss, which measures the OT loss on the full test dataset; the reconstruction error on the full test dataset; a so-called covered area, which measures how well the latent points are matched; and two types of clustering measures.
  3. We demonstrate on specific complex prior distributions that global optimal transport improves the performance of generative models over minibatch-based baselines, as measured by the metrics listed above.
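
To make the notion of an OT loss between latent points and prior samples concrete, here is a minimal NumPy sketch of an entropy-regularized (Sinkhorn) transport cost between two point clouds. It is an illustration only, not the authors' implementation: the function names `sinkhorn_cost` and `_logsumexp`, the squared Euclidean ground cost, the uniform weights, and the values of `epsilon` and `n_iters` are all assumptions made for this example. In a global setting, `latents` would hold the encoded points of the whole dataset rather than those of a single minibatch.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = a.max(axis=axis, keepdims=True)
    out = m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)


def sinkhorn_cost(x, y, epsilon=0.1, n_iters=200):
    """Entropy-regularized OT between two uniformly weighted point clouds.

    Hypothetical helper for illustration; returns (transport cost, plan).
    """
    n, m = x.shape[0], y.shape[0]
    # Squared Euclidean ground cost, shape (n, m) (an assumption).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    log_a = np.full(n, -np.log(n))  # uniform weights on x
    log_b = np.full(m, -np.log(m))  # uniform weights on y
    f = np.zeros(n)  # dual potential for x
    g = np.zeros(m)  # dual potential for y
    for _ in range(n_iters):
        # Log-domain Sinkhorn updates of the two dual potentials.
        f = -epsilon * _logsumexp((g[None, :] - cost) / epsilon + log_b[None, :], axis=1)
        g = -epsilon * _logsumexp((f[:, None] - cost) / epsilon + log_a[:, None], axis=0)
    # Recover the transport plan from the potentials.
    log_plan = (f[:, None] + g[None, :] - cost) / epsilon + log_a[:, None] + log_b[None, :]
    plan = np.exp(log_plan)
    return float((plan * cost).sum()), plan


# Toy stand-ins: shifted Gaussian "latents" versus standard Gaussian prior samples.
rng = np.random.default_rng(0)
latents = rng.normal(size=(64, 2)) + 1.0
prior_samples = rng.normal(size=(64, 2))
cost_value, plan = sinkhorn_cost(latents, prior_samples)
```

The returned `plan` is a soft matching between latent points and prior samples. Computing it on a minibatch matches only the points in that batch, whereas computing it on all latent points at once yields a single matching for the entire dataset, which is the distinction the abstract draws.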