Visualizing probabilistic models and data with Intensive Principal Component Analysis

Cornell Affiliated Author(s)

Author

K.N. Quinn
C.B. Clement
F. De Bernardis
M.D. Niemack
J.P. Sethna

Abstract

Unsupervised learning makes manifest the underlying structure of data without curated training and specific problem definitions. However, the inference of relationships between data points is frustrated by the “curse of dimensionality” in high dimensions. Inspired by replica theory from statistical mechanics, we consider replicas of the system to tune the dimensionality and take the limit as the number of replicas goes to zero. The result is intensive embedding, which not only is isometric (preserving local distances) but also allows global structure to be more transparently visualized. We develop the Intensive Principal Component Analysis (InPCA) and demonstrate clear improvements in visualizations of the Ising model of magnetic spins, a neural network, and the dark energy cold dark matter (ΛCDM) model as applied to the cosmic microwave background. © 2019 National Academy of Sciences. All rights reserved.
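As a rough illustration of the kind of pipeline the abstract describes, the sketch below embeds a set of discrete probability distributions using a Bhattacharyya-based pairwise distance followed by a classical-MDS-style double-centering and eigendecomposition. This is a minimal sketch, not the authors' implementation: the function name is illustrative, the distance is the Bhattacharyya overlap without the paper's exact normalization, and retaining negative eigenvalues (a signature of the Minkowski-like geometry of intensive embeddings) is assumed from the general method.

```python
import numpy as np

def inpca_style_embedding(P):
    """Sketch of an InPCA-style embedding (illustrative, not the paper's code).

    P : (n, k) array; each row is a discrete probability distribution over k outcomes.
    Returns (coords, vals): embedding coordinates and eigenvalues. Eigenvalues may
    be negative, reflecting the non-Euclidean geometry of intensive embeddings.
    """
    # Pairwise Bhattacharyya coefficients: B[i, j] = sum_x sqrt(p_i(x) p_j(x)).
    B = np.sqrt(P) @ np.sqrt(P).T
    # Log-overlap as a squared-distance proxy (0 for identical distributions).
    D2 = -np.log(np.clip(B, 1e-300, None))
    # Double-center, as in classical multidimensional scaling.
    n = len(P)
    J = np.eye(n) - np.ones((n, n)) / n
    W = -0.5 * J @ D2 @ J
    # Eigendecompose the (symmetric) centered matrix; sort by |eigenvalue|.
    vals, vecs = np.linalg.eigh(W)
    order = np.argsort(-np.abs(vals))
    vals, vecs = vals[order], vecs[:, order]
    coords = vecs * np.sqrt(np.abs(vals))
    return coords, vals
```

Identical distributions map to the same point, and the leading components give the low-dimensional visualization; unlike ordinary PCA, negative eigenvalues are kept rather than discarded.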

Date Published

Journal

Proceedings of the National Academy of Sciences of the United States of America

Volume

116

Issue

28

Number of Pages

13762-13767

URL

https://www.scopus.com/inward/record.uri?eid=2-s2.0-85068575652&doi=10.1073%2fpnas.1817218116&partnerID=40&md5=708913b5527651dc35ea9466c9970977

DOI

10.1073/pnas.1817218116

Group (Lab)

James Sethna Group

Funding Source

AST-1454881
DMR-1312160
DMR-1719490