Abstract
The information bottleneck framework provides a systematic approach to learning representations that compress nuisance information in the input and extract semantically meaningful information about predictions. However, the choice of a prior distribution that fixes the dimensionality across all the data can restrict the flexibility of this approach for learning robust representations. We present a novel sparsity-inducing spike-slab categorical prior that uses sparsity as a mechanism to provide the flexibility that allows each data point to learn its own dimension distribution. In addition, it provides a mechanism for learning a joint distribution of the latent variable and the sparsity, and hence it can account for the complete uncertainty in the latent space. Through a series of experiments using in-distribution and out-of-distribution learning scenarios on the MNIST, CIFAR-10, and ImageNet data, we show that the proposed approach improves accuracy and robustness compared to traditional fixed-dimensional priors, as well as other sparsity induction mechanisms for latent variable models proposed in the literature.
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 10207-10222 |
Number of pages | 16 |
Journal | Proceedings of Machine Learning Research |
Volume | 206 |
State | Published - 2023 |
Externally published | Yes |
Event | 26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023 |
Event location | Valencia, Spain |
Event duration | Apr 25 2023 → Apr 27 2023 |
Funding
Argonne National Laboratory’s work was supported by the U.S. Department of Energy, Office of Science, Office of Fusion Energy Sciences, and Office of Advanced Scientific Computing Research, under contract DE-AC02-06CH11357. The work of Michigan State University was funded by the National Institutes of Health under grant R03HG011674. This research used the computational resources of the Argonne Leadership Computing Facility, a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357; the Laboratory Computing Resource Center (LCRC) at Argonne National Laboratory; and the Institute for Cyber-Enabled Research (ICER) at Michigan State University.