COMBAI computational biology and artificial intelligence

Human endogenous unannotated lncRNA (ulncRNA) network

This ulncRNA network was generated from our recent study, "Distinctive functional regime of endogenous lncRNAs in dark regions of human genome."

More than 98% of the human genome is composed of noncoding regions, and >90% of these noncoding regions are actively transcribed, suggesting that they are critical to genome function. Yet <1% of these regions have been functionally characterized, leaving most of the human genome in the dark. This study processes petabyte-level data, systematically decodes endogenous lncRNAs located in unannotated regions of the human genome, and deciphers a distinctive functional regime of lncRNAs hidden in massive RNA-seq data. LncRNAs are distributed divergently across chromosomes, independent of protein-coding regions. Their transcription rarely initiates at promoters through polymerase II, but rather partially at enhancers. Yet conventional enhancer markers (e.g., H3K4me1) account for only a small proportion of lncRNA transcription, suggesting that alternative, as yet unknown mechanisms initiate the majority of lncRNAs. Furthermore, lncRNA self-regulation also contributes notably to lncRNA activation. LncRNAs regulate broad bioprocesses, including transcription and RNA processing, cell cycle, respiration, response to stress, chromatin organization, post-translational modification, and development. Therefore, lncRNAs functionally govern their own regime, distinct from that of protein-coding genes. This finding establishes a clear framework for understanding human genome-wide lncRNA-lncRNA and lncRNA-protein-coding gene regulation.

To search for ulncRNAs and their regulatory network, please enter a frequency score, an absolute coefficient (coef), a maximum node number, and a genomic coordinate based on GRCh38.p10.V27.

Genomic coordinate format: chr:start-end, for example chr3:36642256-36642555. For the best network illustration, max_length is limited to 10,000 bp from start to end.
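To check a coordinate before submitting it, the format and length limit above can be sketched in Python. This is a minimal illustration, not the site's own validation code: the function name parse_region, the accepted chromosome pattern, and the choice to measure length as end minus start are all assumptions.

```python
import re

# Length limit stated on this page (10,000 bp from start to end).
MAX_LENGTH_BP = 10_000

# Assumed pattern: "chr" + chromosome name, then start-end as integers.
_COORD_RE = re.compile(r"^(chr[0-9A-Za-z]+):(\d+)-(\d+)$")


def parse_region(text):
    """Parse 'chr:start-end' and enforce the 10,000 bp limit.

    Hypothetical helper; returns (chrom, start, end) or raises ValueError.
    """
    # Tolerate surrounding whitespace and thousands separators.
    cleaned = text.strip().replace(",", "")
    match = _COORD_RE.match(cleaned)
    if match is None:
        raise ValueError(f"expected chr:start-end, got {text!r}")
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    if start > end:
        raise ValueError("start must not be greater than end")
    # Assumption: length is measured as end - start.
    if end - start > MAX_LENGTH_BP:
        raise ValueError(f"region exceeds {MAX_LENGTH_BP} bp")
    return chrom, start, end


region = parse_region("chr3:36642256-36642555")
```

For the example coordinate above, region holds the chromosome name and the two integer positions; a span longer than 10,000 bp raises ValueError instead.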

For a clear illustration, please also limit max_node (e.g., enter 50 to show the 50 most important nodes).

Network annotation: 1) Node color denotes gene category: lightGreen, blue, and red denote protein_coding, annotated noncoding, and uRNA, respectively. 2) Edge color represents regulation strength: red, pink, blue, lightSkyBlue, and lightGray represent strong positive, middle positive, strong negative, middle negative, and weak regulation (positive or negative), respectively. 3) Edge thickness denotes confidence: the thicker the edge, the higher the confidence.
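When reading an exported or screenshotted network, the legend above can be kept as a small lookup table. The sketch below only restates the color assignments listed on this page; the dictionary and function names are hypothetical, and the numeric cutoffs that separate strong, middle, and weak regulation are not given here, so none are encoded.

```python
# Node colors by gene category, as listed in the legend.
NODE_COLORS = {
    "protein_coding": "lightGreen",
    "annotated noncoding": "blue",
    "uRNA": "red",
}

# Edge colors by regulation strength, as listed in the legend.
EDGE_COLORS = {
    "strong positive": "red",
    "middle positive": "pink",
    "strong negative": "blue",
    "middle negative": "lightSkyBlue",
    "weak": "lightGray",  # weak regulation, positive or negative
}


def edge_meaning(color):
    """Return the regulation label for an edge color (hypothetical helper)."""
    wanted = color.lower()
    for label, legend_color in EDGE_COLORS.items():
        if legend_color.lower() == wanted:
            return label
    raise KeyError(f"color not in legend: {color!r}")


pink_label = edge_meaning("pink")
```

Note that red and blue each appear in both tables, so a color must be looked up in the table for the element type (node or edge) being read.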

References

Wang, A. Distinctive functional regime of endogenous lncRNAs in dark regions of human genome. Computational and Structural Biotechnology Journal (2022). CSBJ1530. https://doi.org/10.1016/j.csbj.2022.05.020

Wang, A. & Hai, R. FINET: Fast Inferring NETwork. BMC Res Notes 13, 521 (2020). https://doi.org/10.1186/s13104-020-05371-0