Founding Machine Learning Scientist - Molecular AI
Posted 2 weeks ago
About This Role
This role involves designing and advancing machine learning models that infer molecular structure and properties from mass spectrometry data. The scientist will lead development of the next-generation molecular foundation model, Gaia-02, extending spectrum-to-structure prediction into broader molecular reasoning and downstream applications.
Responsibilities
- Lead the development of the next generation of our molecular foundation model for mass spectrometry
- Design and train models that infer molecular structure from mass spectra
- Develop latent molecular representations from MS/MS and related data
- Extend structure predictions into downstream molecular reasoning (e.g., bioactivity, prioritization)
Requirements
- Experience developing and training machine learning models in PyTorch or similar frameworks
- Experience designing novel modeling approaches and implementing the latest methods from the literature
- Ability to independently scope and execute research problems involving large, high-dimensional datasets, including handling noise and distributional shifts
- Experience training models at scale (cloud or HPC environments)
- Strong software engineering skills in Python, including writing clean, well-structured, production-quality code
Nice to Have
- Experience with generative models (e.g., autoregressive, diffusion, flow) and geometric deep learning (e.g., GNNs, Deep Sets, EGNNs)
- Experience working with molecular, chemical, or spectral datasets
- Familiarity with metabolomics or mass spectrometry workflows and computational models (e.g., MIST, DreaMS, ICEBERG)
About Novogaia
Novogaia is building computational systems to discover and develop small molecule medicines from fungi, leveraging advances in mass spectrometry and computation to systematically explore nature's chemical diversity at scale.