Neil Thomas
I work at the intersection of AI and biology. I am currently at EvolutionaryScale building AI tools for protein design.
Previously, I was a Research Scientist at Google X. I completed my PhD
in Computer Science at UC Berkeley in 2022, advised by Professor Yun S. Song. Prior to that, I was an AI Resident at
Google X and a Software Engineer at 23andMe. I received my BS in
Engineering Mathematics and Statistics from UC Berkeley.
When I'm not being humbled by biology, I like to be humbled by a variety of hobbies. I like cooking
recipes from Alison Roman, climbing rocks, skiing, playing ultimate frisbee, cycling, playing piano,
watching comedy, and watering my plants.
email / twitter / github / scholar / linkedin
Research Highlights
My research focuses on learning meaningful representations of proteins, with the aim of enabling
applications in protein design, functional annotation, and structure prediction. Check out my thesis
talk "Browsing in the Library of Babel" for an
accessible introduction.
Simulating 500 million years of evolution with a language model
Thomas Hayes*, Roshan Rao*, Halil Akin*, Nicholas James Sofroniew*, Deniz Oktay*, Zeming Lin*, Robert Verkuil*, Vincent Quy Tran, Jonathan Deaton, Marius Wiggert, Rohil Badkundri, Irhum Shafkat, Jun Gong, Alexander Derry, Raul Santiago Molina, Neil Thomas, Yousuf Khan, Chetan Mishra, Carolyn Kim, Liam J. Bartie, Patrick D. Hsu, Tom Sercu, Salvatore Candido, Alexander Rives
bioRxiv, 2024
paper / code / tweetorial / talk / blog
A multi-modal generative protein model that reasons flexibly across protein structure, function, and sequence.
Engineering highly active and diverse nuclease enzymes by combining machine learning and ultra-high-throughput screening
Neil Thomas*, David Belanger*, Chenling Xu, Hanson Lee, Kathleen Hirano, Kosuke Iwai, Vanja Polic, Kendra Nyberg, Kevin Hoff, Lucas Frenz, Charlie Emrich, Jun W Kim, Mariya Chavarha, Abi Ramanan, Jeremy Agresti, Lucy J Colwell
bioRxiv, 2024
paper / code / talk
Designed thousands of highly active, diverse nuclease enzymes using neural network models trained on experimental screening data, outperforming a traditional in vitro directed evolution campaign.
Tuned Fitness Landscapes for Benchmarking Model-Guided Protein Design
Neil Thomas*, Atish Agarwala*, David Belanger, Yun S. Song, Lucy J. Colwell
bioRxiv, 2022
paper / code / tweetorial
Tunable, realistic, synthetic fitness landscapes for benchmarking protein design.
Interpreting Potts and Transformer Protein Models Through the Lens of Simplified Attention
Nicholas Bhattacharya*, Neil Thomas*, Roshan Rao, Justas Dauparas, Peter K. Koo, David Baker, Yun S. Song, Sergey Ovchinnikov
Pacific Symposium on Biocomputing, 2022
paper / code / tweetorial / talk
Introduces “factored attention,” a simplified attention layer that we use to compare and contrast Potts models and Transformers.
Evaluating Protein Transfer Learning with TAPE
Roshan Rao*, Nicholas Bhattacharya*, Neil Thomas*, Yan Duan, Xi Chen, John Canny, Pieter Abbeel, Yun S. Song
Advances in Neural Information Processing Systems (Spotlight), 2019
paper / code / tweetorial / talk / podcast / blog
A suite of benchmarking tasks for protein language models.
Whole-genome sequencing reveals a complex African population demographic history and signatures of local adaptation
Shaohua Fan, Jeffrey P. Spence, ..., Neil Thomas, ... Yun S. Song, Sarah A. Tishkoff, et al.
Cell, 2023
paper
End-to-end learning of multiple sequence alignments with differentiable Smith-Waterman
Samantha Petti, Nicholas Bhattacharya, Roshan Rao, Justas Dauparas, Neil Thomas, Juannan Zhou, Alexander M. Rush, Peter K. Koo, Sergey Ovchinnikov
Bioinformatics, 2022
paper
Functional genomics of OCTN2 variants informs protein-specific variant effect predictor for Carnitine Transporter Deficiency
Megan L. Koleske, Gregory McInnes, Julia E. H. Brown, Neil Thomas, ... Yun S. Song, Russ B. Altman, Kathleen M. Giacomini, et al.
PNAS, 2022
paper
Minding the gaps: The importance of navigating holes in protein fitness landscapes
Neil Thomas, Lucy Colwell
Cell Systems (Preview), 2021
paper
Teaching
During my graduate studies at Berkeley, I had the privilege of teaching:
- Summer 2022 CS 188: Introduction to Artificial Intelligence
- Fall 2020 Stat 135: Concepts of Statistics
My teaching statement.