StablePCA: Distributionally Robust Learning of Shared Representations from Multi-Source Data

Dataemia


By Zhenyu Wang and 4 other authors

Abstract: When synthesizing multi-source high-dimensional data, a key objective is to extract low-dimensional representations that effectively approximate the original features across different sources. Such representations facilitate the discovery of transferable structures and help mitigate systematic biases such as batch effects. We introduce Stable Principal Component Analysis (StablePCA), a distributionally robust framework for constructing stable latent representations by maximizing the worst-case explained variance over multiple sources. A primary challenge in extending classical PCA to the multi-source setting lies in the nonconvex rank constraint, which renders the StablePCA formulation a nonconvex optimization problem. To overcome this challenge, we derive a convex relaxation of StablePCA and develop an efficient Mirror-Prox algorithm to solve the relaxed problem, with global convergence guarantees. Since the relaxed problem generally differs from the original formulation, we further introduce a data-dependent certificate to assess how well the algorithm solves the original nonconvex problem and establish the condition under which the relaxation is tight. Finally, we explore alternative distributionally robust formulations of multi-source PCA based on different loss functions.
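The max-min structure described in the abstract can be illustrated with a minimal sketch. The standard convex relaxation of the rank constraint replaces rank-k projection matrices with the Fantope (symmetric matrices with eigenvalues in [0, 1] and trace k), and the robust objective becomes maximizing the smallest per-source explained variance over that set. The function names below (`stable_pca`, `fantope_projection`) are hypothetical, and a plain projected-subgradient loop stands in for the paper's Mirror-Prox algorithm; this is a toy illustration of the formulation, not the authors' method.

```python
import numpy as np

def fantope_projection(M, k, tol=1e-8):
    """Project a symmetric matrix M onto the Fantope
    {P : 0 <= P <= I, tr(P) = k}, the convex hull of
    rank-k orthogonal projection matrices."""
    vals, vecs = np.linalg.eigh(M)
    # Find theta so that sum(clip(vals - theta, 0, 1)) = k (bisection);
    # the clipped eigenvalues are the projected spectrum.
    lo, hi = vals.min() - 1.0, vals.max()
    while hi - lo > tol:
        theta = 0.5 * (lo + hi)
        if np.clip(vals - theta, 0.0, 1.0).sum() > k:
            lo = theta
        else:
            hi = theta
    lam = np.clip(vals - 0.5 * (lo + hi), 0.0, 1.0)
    return (vecs * lam) @ vecs.T

def stable_pca(covs, k, n_iter=200, step=0.1):
    """Toy solver for max_{P in Fantope} min_g <Sigma_g, P>:
    at each step, move toward the covariance of the currently
    worst-explained source, then project back onto the Fantope."""
    d = covs[0].shape[0]
    P = np.eye(d) * (k / d)  # feasible starting point
    for _ in range(n_iter):
        explained = [np.trace(S @ P) for S in covs]
        g = int(np.argmin(explained))      # worst-case source
        P = fantope_projection(P + step * covs[g], k)
    return P
```

On two sources whose leading directions disagree, the worst-case objective pulls the solution toward directions that carry variance in every source, whereas PCA on a single source can leave another source almost unexplained. Extracting a rank-k representation from the relaxed solution P (e.g., its top-k eigenvectors) is where the paper's tightness certificate would come into play.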

Submission history

From: Zhenyu Wang
[v1] Fri, 2 May 2025 00:53:39 UTC (2,482 KB)
[v2] Tue, 3 Mar 2026 15:28:24 UTC (7,789 KB)
[v3] Sat, 7 Mar 2026 18:38:13 UTC (7,789 KB)


