Predicting cross-tissue hormone-gene relations using balanced word embeddings

doi: 10.1093/bioinformatics/btac578.

Online ahead of print.



Aditya Jadhav et al.





Inter-organ/inter-tissue communication is central to multi-cellular organisms, including humans, and mapping inter-tissue interactions can advance system-level whole-body modeling efforts. Large volumes of biomedical literature have fostered studies that map within-tissue or tissue-agnostic interactions, but literature-mining studies that infer inter-tissue relations, such as those between hormones and genes, are still missing.


We present a first study to predict, from biomedical literature, the hormone-gene associations mediating inter-tissue signaling in the human body. Our BioEmbedS* models use neural-network-based Biomedical word Embeddings with a Support Vector Machine classifier to predict whether a hormone-gene pair is associated and, if so, whether the associated gene is involved in the hormone’s production or in the response to it. Model training relies on our unified dataset HGv1 (Hormone-Gene version 1) of ground-truth associations between genes and endocrine hormones, which we compiled and carefully balanced in the embedded space to handle data disparities such as those between poorly studied and well-studied hormones. Our BioEmbedS model recapitulates known gene mediators of tissue-tissue signaling with 70.4% accuracy; predicts novel inter-tissue communication genes in humans that are enriched for hormone-related disorders; and generalizes well to mouse, thereby holding promise for its extension to other multi-cellular organisms.
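The general idea of classifying hormone-gene pairs from word embeddings after balancing the classes in the embedded space can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-dimensional toy vectors, the hypothetical names (`hormone_a`, `gene_1`, and so on), the choice to represent a pair by concatenating its two embeddings, the SMOTE-like interpolation used for balancing, and the small linear hinge-loss classifier that stands in for a full Support Vector Machine. The second prediction task (production vs. response) is not covered.

```python
import random

def pair_features(hormone_vec, gene_vec):
    """Represent a hormone-gene pair by concatenating the two embeddings
    (an assumed feature construction, chosen for simplicity)."""
    return list(hormone_vec) + list(gene_vec)

def oversample(points, target, rng):
    """Grow a minority class to `target` points by interpolating between
    randomly chosen existing points (a simplified SMOTE-like step)."""
    out = [list(p) for p in points]
    while len(out) < target:
        a, b = rng.choice(points), rng.choice(points)
        t = rng.random()
        out.append([p + t * (q - p) for p, q in zip(a, b)])
    return out

def score(w, b, x):
    """Signed distance-like score of x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=200, eta=0.1, lam=0.001, seed=0):
    """Stochastic subgradient descent on the L2-regularized hinge loss.
    Labels must be +1 (associated) or -1 (not associated)."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * score(w, b, X[i]) < 1
            w = [(1 - eta * lam) * wj for wj in w]  # regularization shrink
            if violated:  # hinge-loss subgradient step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def predict(w, b, x):
    """Return +1 for a predicted association, -1 otherwise."""
    return 1 if score(w, b, x) >= 0 else -1

# Toy embeddings: invented values, not real word vectors.
hormones = {"hormone_a": [1.0, 0.2], "hormone_b": [0.8, -0.1]}
genes = {"gene_1": [0.9, 0.3], "gene_2": [0.7, 0.0],
         "gene_3": [-0.8, 0.1], "gene_4": [-0.9, -0.2]}

# Hypothetical labels: two associated pairs, four unassociated pairs.
positive_pairs = [("hormone_a", "gene_1"), ("hormone_b", "gene_2")]
negative_pairs = [("hormone_a", "gene_3"), ("hormone_a", "gene_4"),
                  ("hormone_b", "gene_3"), ("hormone_b", "gene_4")]

pos_feats = [pair_features(hormones[h], genes[g]) for h, g in positive_pairs]
neg_feats = [pair_features(hormones[h], genes[g]) for h, g in negative_pairs]

# Balance the classes in the embedded space before training.
rng = random.Random(42)
pos_balanced = oversample(pos_feats, len(neg_feats), rng)

X = pos_balanced + neg_feats
y = [1] * len(pos_balanced) + [-1] * len(neg_feats)
w, b = train_linear_svm(X, y)
preds = [predict(w, b, x) for x in X]
```

In practice one would likely replace the toy vectors with pretrained biomedical word embeddings and the hand-written trainer with an established SVM implementation; the sketch only shows how oversampling in the embedded space fits between feature construction and classifier training.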


Our model predictions and datasets are freely available, as is all relevant code.

Supplemental information:

Supplementary information available at Bioinformatics online.
