Willems et al. 2016 (PRJNA348553)

General Details

Title N-terminal Proteomics Assisted Profiling of the Unexplored Translation Initiation Landscape in Arabidopsis thaliana.
Organism Arabidopsis thaliana
Number of Samples 2
Release Date 2016/10/14 00:00
Sequencing Types
Protocol Details

Study Links

Repository Details

SRA SRP091588
ENA SRP091588
GEO
BioProject PRJNA348553

Publication

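Title N-terminal Proteomics Assisted Profiling of the Unexplored Translation Initiation Landscape in Arabidopsis thaliana.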
Authors Willems P, Ndah E, Jonckheere V, Stael S, Sticker A, Martens L, Van Breusegem F, Gevaert K, Van Damme P
Journal Molecular & cellular proteomics : MCP
Publication Date 2017 Jun
Abstract Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
PMC PMC5461538
PMID 28432195
DOI
Run Accession | Study Accession | Scientific Name | Cell Line | Library Type | Treatment | GWIPS-viz | Trips-Viz | Reads | BAM | BigWig (F) | BigWig (R)
SRR4424237 | PRJNA348553 | Arabidopsis thaliana | | Ribo-Seq | Lactimidomycin | | | | | |
SRR4424238 | PRJNA348553 | Arabidopsis thaliana | | Ribo-Seq | Cycloheximide | | | | | |
