Global Sinkhorn Autoencoder — Optimal Transport on the latent representation of the full dataset
Authors
Published at
Annales Univ. Sci. Budapest., Sect. Comp. 57 (2024) 101–115
Date
26 June 2024
We propose an Optimal Transport (OT)-based generative model from the Wasserstein Autoencoder (WAE) family with one distinguishing property: the positions of the latent points are optimized over the full training dataset rather than over a minibatch. Our contributions are as follows:
- We define a new class of global Wasserstein Autoencoder models, and implement an Optimal Transport-based incarnation we call the Global Sinkhorn Autoencoder.
- We implement several metrics for evaluating such models in both the unsupervised and the semi-supervised setting: the global OT loss, measured on the full test dataset; the reconstruction error on the full test dataset; a so-called covered area, which measures how well the latent points are matched; and two types of clustering measures.
- We demonstrate on specific complex prior distributions that global optimal transport improves the performance of generative models compared to minibatch-based baselines when evaluated by the previously listed metrics.
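
To make the idea of an entropy-regularized OT loss between latent points and prior samples concrete, here is a minimal NumPy sketch of log-domain Sinkhorn iterations. It is an illustration under stated assumptions, not the implementation from the paper: the function name `sinkhorn_plan`, the squared Euclidean cost, uniform point weights, and the values of `eps` and `n_iters` are all choices made for this example. In the "global" setting described above, `latents` would hold the encoded points of the full dataset rather than of a single minibatch.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def sinkhorn_plan(latents, prior_samples, eps=0.1, n_iters=200):
    """Entropy-regularized OT between two point clouds (hypothetical helper).

    Assumes uniform weights on both clouds and a squared Euclidean cost.
    Returns the transport plan and the transport cost sum(plan * cost).
    """
    n, m = len(latents), len(prior_samples)
    diff = latents[:, None, :] - prior_samples[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)          # shape (n, m)

    log_a = np.full(n, -np.log(n))             # uniform source weights
    log_b = np.full(m, -np.log(m))             # uniform target weights
    f = np.zeros(n)                            # dual potential for latents
    g = np.zeros(m)                            # dual potential for prior samples

    for _ in range(n_iters):
        # Alternate updates so that row, then column, marginals are matched.
        f = eps * (log_a - _logsumexp((g[None, :] - cost) / eps, axis=1))
        g = eps * (log_b - _logsumexp((f[:, None] - cost) / eps, axis=0))

    plan = np.exp((f[:, None] + g[None, :] - cost) / eps)
    return plan, float(np.sum(plan * cost))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(64, 2))               # stand-in for encoded latent points
    p = rng.normal(size=(64, 2))               # stand-in for samples from the prior
    plan, ot_cost = sinkhorn_plan(z, p)
    print(plan.shape, ot_cost)
```

Because the cost matrix has one entry per pair of points, memory grows quadratically with the number of latent points, which is the main practical difference between calling such a routine on a minibatch and on a full dataset.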