A comprehensive catalog of predicted functional upstream open reading frames in humans
Host: Wu Yuechao
Date: 06 Apr 2022
Upstream open reading frames (uORFs) latent in mRNA transcripts are thought to modify translation of coding sequences by altering ribosome activity. Not all uORFs are thought to be active in such a process. To estimate the impact of uORFs on the regulation of translation in humans, we first circumscribed the universe of all possible uORFs based on coding gene sequence motifs and identified 1.3 million unique uORFs. To determine which of these are likely to be biologically relevant, we built a simple Bayesian classifier using 89 attributes of uORFs labeled as active in ribosome profiling experiments. This allowed us to extrapolate to a comprehensive catalog of likely functional uORFs. We validated our predictions using in vivo protein levels and ribosome occupancy from 46 individuals. This is a substantially larger catalog of functional uORFs than has previously been reported. Our ranked list of likely active uORFs allows researchers to test their hypotheses regarding the role of uORFs in health and disease. We demonstrate several examples of biological interest through the application of our catalog to somatic mutations in cancer and disease-associated germline variants in humans.
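To make the first step concrete, here is a minimal sketch of how candidate uORFs might be enumerated from sequence motifs in a single transcript. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not specify the motif definition, so this sketch assumes ATG-only start codons and the three canonical stop codons, and the names `find_uorfs`, `STOP_CODONS`, and `cds_start` are hypothetical.

```python
# Assumption: canonical stop codons only; the catalog's own rules may differ.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(transcript, cds_start, start_codons=("ATG",)):
    """Enumerate candidate uORFs in one transcript (hypothetical helper).

    transcript   -- mRNA sequence written as DNA (A/C/G/T)
    cds_start    -- 0-based index of the first base of the main CDS
    start_codons -- start motifs to scan for (assumed ATG-only here)

    Returns a list of (start, end, kind) tuples, where `end` is the index
    just past the stop codon and `kind` is "contained" (ends in the 5' UTR)
    or "overlapping" (ends inside the main CDS, out of frame with it).
    """
    seq = transcript.upper()
    uorfs = []
    # The start codon must lie entirely within the 5' UTR.
    for start in range(0, cds_start - 2):
        if seq[start:start + 3] not in start_codons:
            continue
        # Walk codon by codon until the first in-frame stop.
        end = None
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                break
        if end is None:
            continue  # no in-frame stop: not an ORF
        in_frame_with_cds = (cds_start - start) % 3 == 0
        if in_frame_with_cds and end > cds_start:
            continue  # N-terminal extension of the main ORF, not a uORF
        kind = "contained" if end <= cds_start else "overlapping"
        uorfs.append((start, end, kind))
    return uorfs


# Toy transcripts (made up for illustration, not real genes).
contained = find_uorfs("GGATGAAATAAC" + "ATGGCCTGA", cds_start=12)
overlapping = find_uorfs("CATGG" + "ATGGCTAAGTGA", cds_start=5)
extension = find_uorfs("ATGAAA" + "ATGGCCTGA", cds_start=6)
print(contained)    # [(2, 11, 'contained')]
print(overlapping)  # [(1, 13, 'overlapping')]
print(extension)    # []
```

Running a scan of this kind over every annotated transcript and de-duplicating by genomic coordinates would yield the sort of candidate universe the abstract describes; the classification step that ranks candidates as likely functional is separate and not sketched here.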