Entropic Latent Variable Discovery. (arXiv:1807.10399v1 [stat.ML])

We consider the problem of discovering the simplest latent variable that can
make two observed discrete variables conditionally independent. This problem
has appeared in the literature as probabilistic latent semantic analysis
(pLSA), and has connections to non-negative matrix factorization. When the
simplicity of the variable is measured through its cardinality, we show that a
solution to this latent variable discovery problem can be used to distinguish
direct causal relations from spurious correlations among almost all joint
distributions on simple causal graphs with two observed variables. Conjecturing
that a similar identifiability result holds with Shannon entropy, we study a
loss function that trades off the entropy of the latent variable against the
conditional mutual information of the observed variables. We then propose a
latent variable discovery algorithm, LatentSearch, and show that its stationary
points are the stationary points of our loss function. We experimentally show
that LatentSearch can indeed be used to distinguish direct causal relations
from spurious correlations.
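The trade-off described above can be sketched numerically. Assuming the loss takes the form I(X;Y|Z) + beta * H(Z) with a hypothetical weighting parameter `beta` (the exact weighting in the paper may differ), the following minimal Python sketch evaluates it for a candidate joint distribution q(x, y, z):

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability vector."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def latent_loss(q, beta):
    """Evaluate I(X;Y|Z) + beta * H(Z) for a joint distribution q[x, y, z].

    The weighting I(X;Y|Z) + beta*H(Z) is an assumed form of the
    entropy / conditional-mutual-information trade-off from the abstract.
    """
    q = np.asarray(q, dtype=float)
    q = q / q.sum()                        # normalize to a distribution
    qz = q.sum(axis=(0, 1))                # marginal q(z)
    # I(X;Y|Z) = sum_z q(z) * I(X;Y | Z=z)
    cmi = 0.0
    for z in range(q.shape[2]):
        if qz[z] == 0:
            continue
        pxy = q[:, :, z] / qz[z]           # conditional q(x, y | z)
        px = pxy.sum(axis=1, keepdims=True)
        py = pxy.sum(axis=0, keepdims=True)
        mask = pxy > 0
        cmi += qz[z] * np.sum(pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask]))
    return cmi + beta * entropy(qz)
```

If X and Y are conditionally independent given Z, the conditional mutual information term vanishes and the loss reduces to beta * H(Z), so minimizing it favors the lowest-entropy latent variable that renders the observed pair independent.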
