Properties | |
---|---|
authors | Tim R. Davidson, Luca Falorsi, Nicola de Cao, Thomas Kipf, Jakub M. Tomczak |
year | 2018 |
url | https://arxiv.org/abs/1804.00891 |
Abstract
The Variational Auto-Encoder (VAE) is one of the most used unsupervised machine learning models. But although the default choice of a Gaussian distribution for both the prior and posterior represents a mathematically convenient distribution often leading to competitive results, we show that this parameterization fails to model data with a latent hyperspherical structure. To address this issue we propose using a von Mises-Fisher (vMF) distribution instead, leading to a hyperspherical latent space. Through a series of experiments we show how such a hyperspherical VAE, or 𝒮-VAE, is more suitable for capturing data with a hyperspherical latent structure, while outperforming a normal, 𝒩-VAE, in low dimensions on other data types. Code at this http URL and this https URL
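Since the whole construction hinges on sampling from a vMF distribution on the unit hypersphere, here is a minimal NumPy sketch of the classic Ulrich/Wood rejection sampler the paper builds on. The function name `sample_vmf` and its signature are my own, not from the paper's released code, and this plain sampler deliberately omits the reparameterization machinery the paper adds so gradients can flow through the sampling step.

```python
import numpy as np

def sample_vmf(mu: np.ndarray, kappa: float, rng=None) -> np.ndarray:
    """One draw from vMF(mu, kappa) on S^(m-1), via Wood (1994) rejection sampling."""
    rng = rng or np.random.default_rng()
    m = mu.shape[0]
    # Step 1: sample the scalar component w along the mean direction.
    b = (-2 * kappa + np.sqrt(4 * kappa**2 + (m - 1) ** 2)) / (m - 1)
    x0 = (1 - b) / (1 + b)
    c = kappa * x0 + (m - 1) * np.log(1 - x0**2)
    while True:
        z = rng.beta((m - 1) / 2, (m - 1) / 2)
        w = (1 - (1 + b) * z) / (1 - (1 - b) * z)
        if kappa * w + (m - 1) * np.log(1 - x0 * w) - c >= np.log(rng.uniform()):
            break
    # Step 2: sample a uniform direction v in the tangent plane at the north pole.
    v = rng.standard_normal(m - 1)
    v /= np.linalg.norm(v)
    # Step 3: assemble a point on the sphere around e1 = (1, 0, ..., 0).
    x = np.concatenate(([w], np.sqrt(1 - w**2) * v))
    # Step 4: Householder reflection mapping e1 onto mu rotates the sample into place.
    u = np.zeros(m)
    u[0] = 1.0
    u = u - mu
    norm = np.linalg.norm(u)
    if norm > 1e-12:  # if mu is already e1, no rotation is needed
        u /= norm
        x = x - 2 * np.dot(u, x) * u
    return x

# Example: draws concentrate around mu as kappa grows.
mu = np.array([0.0, 0.0, 1.0])
samples = np.stack([sample_vmf(mu, kappa=50.0) for _ in range(5)])
print(samples @ mu)  # cosine similarities close to 1
```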
Notes
- "However, even for m>20 we observe a vanishing surface problem (see Figure 6 in Appendix E). This could thus lead to unstable behavior of hyperspherical models in high dimensions."
- Basically, the hypersphere's surface area starts collapsing in high dimensions (m > 20), which makes it an unsuitable choice: the surface area of the unit hypersphere peaks around m ≈ 7 and then decays toward zero, so embeddings on this manifold lose discriminative power. This is backed by the paper's results, where the 𝒮-VAE outperforms the 𝒩-VAE up to d = 40. The sketch below checks the collapse numerically.
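To see the vanishing surface problem concretely, here is a quick sanity check of my own (not from the paper) using the closed-form surface area of the unit hypersphere S^(m-1) embedded in R^m, A(m) = 2π^(m/2) / Γ(m/2):

```python
import math

def unit_sphere_area(m: int) -> float:
    """Surface area of the unit hypersphere S^(m-1) embedded in R^m:
    A(m) = 2 * pi^(m/2) / Gamma(m/2)."""
    return 2.0 * math.pi ** (m / 2) / math.gamma(m / 2)

for m in (3, 7, 21, 41):
    print(f"m = {m:3d}: area ≈ {unit_sphere_area(m):.3e}")
```

The area is 4π ≈ 12.57 at m = 3, peaks around 33 near m = 7, drops below 0.3 by m = 21, and is under 1e-7 by m = 41, which lines up with the instability the paper reports for m > 20.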