William Hackett Successfully Defends
Congratulations to William Hackett on the successful defense of their thesis, "Improving Reproducibility and Standards in Quantitative N-Glycoproteomic Data." Great job, Will!
To learn more about Will's thesis, read their abstract below:
ABSTRACT
More than half of all human proteins are glycosylated, making glycosylation one of the most abundant post-translational modifications in proteomics. N-glycosylation is a prevalent and diverse type of glycosylation with key roles in regulating systems such as protein folding and host-pathogen recognition; without a proper understanding of the heterogeneities of N-glycosylation, efforts to understand biological systems and to combat the maladies that affect those systems will be hindered, knowingly and unknowingly. N-glycosylation is a semi-stochastic process governed by local chemistries and enzymatic availability, and it is regulated by end-process evaluation, making modeling infeasible. This drives glycoproteomics to rely on observational data from tandem mass spectrometry, a powerful tool that comes with logistical and technical limitations on the availability and compatibility of data. N-glycopeptides can be identified in tandem mass spectrometry data, but with greater uncertainty than in traditional proteomics because of a variety of factors. This uncertainty propagates into the quantification of these molecules, generating interdependent datasets with small sample sizes and high missing value rates. N-glycans are inherently interrelated by the biosynthetic network in which they are processed, and as a result they share much information and many chemical properties, which makes identification and quantification more difficult. While advances in N-glycoproteomics continue, much is still needed for a true and reliable understanding of quantitative N-glycoproteomics. To make use of the existing data, an R package called RAMZIS (Relative Assessment of m/z Identifications by Similarity) was developed. This toolkit focuses on data quality assessment and identifying broad differences between glycosylation sites.
RAMZIS uses a series of permutation tests with a weighted Tanimoto similarity assessment to provide researchers with information on their ability to use their data, the presence of outliers, the probable differentiability of glycosylation sites, and how to improve their future experiments. Data Independent Acquisition (DIA) has enabled vast improvements in the ability of proteomics to quantify and identify proteins in complex samples, but these improvements cannot be directly applied to glycoproteomics. Glycoproteomic datasets are more heterogeneous than deglycosylated proteomic datasets and have lower overall signal, the latter compounding the issues caused by the former. For glycoproteomics to make full use of the power of DIA and account for its idiosyncrasies, a large number of bioinformatic advancements need to be made in glycopeptide identification, validation, and quantification. To this end, we developed a Python package called GlyLine as a framework to assess glycoproteomic DIA data; it tracks coeluting product ions of identified glycopeptides and splits the signal from shared product ions in order to produce MS2-level quantifications of the identified glycopeptides and provide databases of information for further analysis. As glycoproteomics advances and comes into greater prominence, it is vital that experiments and bioinformatic workflows be repeatable, as quantitative glycoproteomic data are reported in many different ways that are often incompatible. We have worked with the MIRAGE Commission to develop a community-based minimum reporting guideline for glycoproteomic experiments.
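For readers unfamiliar with the statistical idea mentioned in the abstract, the following is a minimal Python sketch of a permutation test built on a weighted Tanimoto similarity. It is not the RAMZIS implementation (RAMZIS is an R package, and its actual weighting scheme and test design are not described here). The function names, the per-feature weights, the use of group mean profiles, and the example abundances are all assumptions made for illustration.

```python
import random


def weighted_tanimoto(x, y, w):
    """Weighted Tanimoto similarity between two abundance vectors.

    x, y: equal-length lists of non-negative abundances.
    w:    equal-length list of non-negative weights (hypothetical; the
          weighting RAMZIS actually uses is not specified in the abstract).
    """
    dot = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    xx = sum(wi * xi * xi for wi, xi in zip(w, x))
    yy = sum(wi * yi * yi for wi, yi in zip(w, y))
    denom = xx + yy - dot
    return dot / denom if denom > 0 else 0.0


def mean_profile(samples):
    """Column-wise mean of a list of equal-length abundance vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def permutation_test(group_a, group_b, w, n_perm=2000, seed=0):
    """Estimate how unusual the observed between-group similarity is.

    Sample labels are shuffled n_perm times; the p-value is the share of
    shuffles whose similarity is at least as low as the observed one.
    """
    rng = random.Random(seed)
    observed = weighted_tanimoto(mean_profile(group_a), mean_profile(group_b), w)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        sim = weighted_tanimoto(
            mean_profile(pooled[:n_a]), mean_profile(pooled[n_a:]), w
        )
        if sim <= observed + 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Made-up abundances for three glycoforms at one site, three replicates each.
site_a = [[10, 1, 0], [9, 2, 0], [11, 1, 0]]
site_b = [[0, 1, 10], [0, 2, 9], [0, 1, 11]]
weights = [1.0, 1.0, 1.0]
similarity, p_value = permutation_test(site_a, site_b, weights)
print(similarity, p_value)
```

A low observed similarity together with a small p-value would suggest that the two sites differ by more than replicate-to-replicate variation. With only three replicates per group, the smallest p-value this design can reach is limited by the number of distinct label assignments, which is one reason small sample sizes constrain this kind of analysis.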
Major Professor: Joseph Zaia