PhD project: Understanding dissolved organic matter (DOM) by destroying it: Combining metabolomics and bioinformatics to overcome the chimeric nature of DOM (International Max Planck Research School, Jena, Germany)
International Max Planck Research School for Global Biogeochemical Cycles
In cooperation with the Friedrich Schiller University Jena, the Max Planck Institute for Biogeochemistry houses a unique and flexible research program that grants German and foreign students a broad selection of learning opportunities while still maintaining a research focus.
The IMPRS-gBGC offers a PhD program specializing in global biogeochemistry and related Earth system sciences. The overall research and teaching focuses on:
- Improved understanding of biogeochemical processes with an emphasis on terrestrial ecosystems
- Development of observational techniques to monitor and assess biogeochemical feedbacks in the Earth system
- Theory and model development for improving the representation of biogeochemical processes in comprehensive Earth system models
This project focuses on the information content of complex mixtures, in particular dissolved organic matter (DOM), and how it can be properly revealed by modern mass spectrometry and bioinformatic tools. DOM is one of the most ubiquitous and complex mixtures on Earth, being composed of thousands of individual molecules (Hawkes et al., 2020; Roth et al., 2019; Smith et al., 2018). This makes it an ideal study object for benchmarking new bioinformatics tools that deconvolute such mixtures. Modern mass spectrometry allows us to resolve the finest details in complex samples, but reaches its limits in identifying the individual components (Hertkorn et al., 2008). This is due to the chimeric nature of DOM, which persists even after chromatographic separation and hampers the acquisition of clean tandem (MS2) mass spectra for identification (Petras et al., 2017), for example by fragmentation trees (Dührkop et al., 2015). Hence, the structural causes of chemical differences between DOM types remain elusive, which limits our understanding of the information that DOM inherits from metabolic processes at different ecosystem scales (microbial community, soil profile, landscape, watershed). New approaches to deconvolute or aggregate chimeric MS2 data are thus instrumental to spark progress toward full metabolome annotation in complex DOM samples (Dührkop et al., 2020; Rogers et al., 2019).
Research aim & questions
The overall research aim of the PhD project is to analyze a representative set of DOM samples in full detail by ultrahigh resolution (Orbitrap) MS2 analysis and to apply and advance existing bioinformatics approaches for the optimal analysis of the MS2 datasets (Aron et al., 2020; Dührkop et al., 2015, 2020; Rogers et al., 2019). Depending on the successful candidate’s qualifications and development, the project may head in the metabolomics or bioinformatics direction.
The metabolomics project centers around the following questions:
- Which characteristic MS2 features (mass differences, fragment ions) are common to different DOM samples, and which ones distinguish them?
- Which structural classes do these features represent?
- Which bioinformatics approach is best suited for analyzing these patterns?
- How do synthetic mixtures of known compounds compare to natural mixtures, and what does that imply for the deconvolution of chimeric MS2 data?
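As an illustration of the first question above, the following minimal sketch compares pairwise mass differences between two fragment peak lists. The m/z values, the function name `mass_differences`, and the rounding-based matching are hypothetical and chosen only for illustration; they are not part of the project's actual workflow, and a real analysis would more likely match differences within a ppm tolerance.

```python
from itertools import combinations


def mass_differences(peaks, decimals=4):
    """Return the set of rounded pairwise m/z differences in one peak list.

    Rounding is a simplifying assumption standing in for tolerance-based
    matching of mass differences.
    """
    return {round(abs(a - b), decimals) for a, b in combinations(peaks, 2)}


# Hypothetical fragment m/z values for two samples (illustrative only).
sample_a = [100.0000, 118.0106, 162.0004]
sample_b = [200.0000, 218.0106, 250.0000]

diffs_a = mass_differences(sample_a)
diffs_b = mass_differences(sample_b)

shared = diffs_a & diffs_b   # mass differences common to both samples
only_a = diffs_a - diffs_b   # mass differences found only in sample A

print("shared:", sorted(shared))
print("only in A:", sorted(only_a))
```

In this toy example, both peak lists contain a difference of 18.0106, which is consistent with a loss of H2O, while a difference of 43.9898 (consistent with CO2) appears only in the first list.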
The bioinformatics project centers around computational method development for the analysis of DOM MS2 spectra:
- Can we compute fragmentation trees even if the MS2 spectrum represents a mixture of several isobaric or even isomeric compounds?
- Do we rediscover mass spectral motifs in the DOM data (van der Hooft et al., 2016) that allow us to decompose the chimeric spectra?
- Can machine learning techniques correctly identify and predict compound classes in such mixtures?
Depending on the successful candidate’s qualifications (chemistry or informatics), work will focus more on MS2 data acquisition and biogeochemical interpretation (chemistry focus) or on the development and optimization of computational routines for the analysis of complex MS2 datasets (informatics focus). The MS2 data will be acquired with an ultrahigh resolution Orbitrap Elite mass spectrometer that allows both direct injection and LC analyses (Simon et al., 2018). Biogeochemical analysis will encompass the use of self-written routines and existing tools like GNPS (Aron et al., 2020; Petras et al., 2017), SIRIUS (Dührkop et al., 2015), or CANOPUS (Dührkop et al., 2020). Computational development of novel bioinformatics pipelines will make extensive use of machine learning techniques to decompose MS2 information from mixtures of knowns and unknowns. Both research foci are expected to yield novel insights into the information content of complex mixtures in terms of indicative mass differences, their diversity, and their potential use for deconvoluting structural substance classes in complex samples.
Affiliation and support
The PhD candidate will be affiliated with the Chair of Bioinformatics at the Institute for Informatics at the FSU Jena and with the working group Molecular Biogeochemistry at MPI-BGC. Supervision is provided by Prof. Dr. Sebastian Böcker at the FSU Jena and by Prof. Dr. Gerd Gleixner at MPI-BGC. Additional support will be provided by Kai Dührkop (bioinformatics, especially SIRIUS and CANOPUS), Carsten Simon (DOM analysis, Orbitrap), and Daniel Petras (DOM analysis, LC-MS2 with GNPS).
Applications to the IMPRS-gBGC are open to well-motivated and highly qualified students from all countries. For this particular PhD project we seek a candidate with qualifications in either metabolomics or bioinformatics.
For the metabolomics focus, we seek a candidate with
- a Master’s degree in Chemistry, Biochemistry, or another chemistry-related science,
- experience in analytical chemistry, LC-MS, and handling of big data sets,
- ideally, experience in high resolution MS data analysis (FT-ICR-MS or Orbitrap),
- very good oral and written communication skills in English.
For the bioinformatics focus, we seek a candidate with
- a Master’s degree in Bioinformatics, Informatics, or another informatics-related science,
- experience in programming and the use of LC-MS tools such as SIRIUS, GNPS, or CANOPUS,
- ideally, experience in machine learning techniques and small molecule identification,
- very good oral and written communication skills in English.
The Max Planck Society seeks to increase the number of women in those areas where they are underrepresented and therefore explicitly encourages women to apply. The Max Planck Society is committed to increasing the number of individuals with disabilities in its workforce and therefore encourages applications from such qualified individuals.
The application deadline for the fully funded PhD positions is August 23, 2021.
The application process consists of three steps:
- Online registration and submission of application documents (June 30 – August 23, 2021)
- Possibly a phone or video conference interview (until September 10, 2021)
- Recruitment event in Jena (October 13 – 15, 2021)
Find out more and apply online: www.imprs-gbgc.de