Motif-based models accurately predict cell type-specific distal regulatory elements

Cornejo-Páramo, Paola and Zhang, Xuan and Louis, Lithin and Li, Zelun and Yang, Yihua and Wong, Emily S. (2025) Motif-based models accurately predict cell type-specific distal regulatory elements. Nature Communications, 16 (1). ISSN 2041-1723

Full text not available from this repository.
Link to published document: https://doi.org/10.1038/s41467-025-65362-2

Abstract

Deciphering how DNA sequence specifies cell-type-specific regulatory activity is a central challenge in gene regulation. We present Bag-of-Motifs (BOM), a computational framework that represents distal cis-regulatory elements as unordered counts of transcription factor (TF) motifs. This minimalist representation, combined with gradient-boosted trees, enables the accurate prediction of cell-type-specific enhancers across mouse, human, zebrafish, and Arabidopsis datasets. Despite its simplicity, BOM outperforms more complex deep-learning models while using fewer parameters. We validate BOM’s predictions experimentally by constructing synthetic enhancers from the most predictive motifs, demonstrating that these motif sets drive cell-type-specific expression. By providing direct interpretability and broad applicability, BOM reveals a highly predictive sequence code at distal regulatory regions and offers a scalable framework for dissecting cis-regulatory grammar across diverse species and conditions.
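The "unordered counts of TF motifs" representation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The motif names, the exact-match strings standing in for motifs, and the function names are all hypothetical; a real pipeline would presumably scan sequences with position weight matrices from a motif database and pass the resulting count vectors to a gradient-boosted tree classifier.

```python
# Hypothetical toy motifs, written as exact-match strings for illustration only.
# Real motif scanning would use position weight matrices, not literal strings.
TOY_MOTIFS = {
    "EBOX": "CACGTG",
    "GATA": "GATA",
    "TATA": "TATAAA",
}


def count_occurrences(sequence, pattern):
    """Count exact matches of pattern in sequence, allowing overlaps."""
    count = 0
    start = sequence.find(pattern)
    while start != -1:
        count += 1
        start = sequence.find(pattern, start + 1)
    return count


def bag_of_motifs(sequence, motifs=TOY_MOTIFS):
    """Return motif counts in a fixed (alphabetical by name) feature order.

    Only the counts are kept; motif position and order are discarded,
    which is what makes the representation a "bag".
    """
    seq = sequence.upper()
    return [count_occurrences(seq, motifs[name]) for name in sorted(motifs)]


# Two toy sequences with the same motifs in a different order
# map to the same feature vector.
features_a = bag_of_motifs("GATAAGATACACGTG")
features_b = bag_of_motifs("CACGTGGATAAGATA")
print(sorted(TOY_MOTIFS), features_a, features_b)
```

Each sequence becomes one fixed-length row of counts, so a set of candidate regulatory elements forms an ordinary feature matrix. Because every feature is a named motif, the importance scores of a tree-based classifier trained on that matrix can be read directly as motif contributions.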

Item Type: Article
Subjects: R Medicine > R Medicine (General)
Depositing User: Repository Administrator
Date Deposited: 20 Dec 2025 04:05
Last Modified: 20 Dec 2025 04:05
URI: http://eprints.victorchang.edu.au/id/eprint/1778
