Neil Thomas

I work at the intersection of AI and biology. I am currently at EvolutionaryScale building AI tools for protein design.

Previously, I was a Research Scientist at Google X. I completed my PhD in Computer Science at UC Berkeley in 2022, advised by Professor Yun S. Song. Prior to that, I was an AI Resident at Google X and a Software Engineer at 23andMe. I received my BS in Engineering Mathematics and Statistics from UC Berkeley.

When I'm not being humbled by biology, I like to be humbled by a variety of hobbies. I like cooking recipes from Alison Roman, climbing rocks, skiing, playing ultimate frisbee, cycling, playing piano, watching comedy, and watering my plants.

email  /  twitter  /  github  /  scholar  /  linkedin

Research Highlights

My research focuses on learning meaningful representations of proteins, with the aim of enabling applications in protein design, functional annotation, and structure prediction. Check out my thesis talk "Browsing in the Library of Babel" for an accessible introduction.

Simulating 500 million years of evolution with a language model


Thomas Hayes*, Roshan Rao*, Halil Akin*, Nicholas James Sofroniew*, Deniz Oktay*, Zeming Lin*, Robert Verkuil*, Vincent Quy Tran, Jonathan Deaton, Marius Wiggert, Rohil Badkundri, Irhum Shafkat, Jun Gong, Alexander Derry, Raul Santiago Molina, Neil Thomas, Yousuf Khan, Chetan Mishra, Carolyn Kim, Liam J. Bartie, Patrick D. Hsu, Tom Sercu, Salvatore Candido, Alexander Rives
bioRxiv, 2024
paper / code / tweetorial / talk / blog

A multi-modal generative protein model that reasons flexibly across protein structure, function, and sequence.

Engineering highly active and diverse nuclease enzymes by combining machine learning and ultra-high-throughput screening


Neil Thomas*, David Belanger*, Chenling Xu, Hanson Lee, Kathleen Hirano, Kosuke Iwai, Vanja Polic, Kendra Nyberg, Kevin Hoff, Lucas Frenz, Charlie Emrich, Jun W Kim, Mariya Chavarha, Abi Ramanan, Jeremy Agresti, Lucy J Colwell
bioRxiv, 2024
paper / code / talk

Designed thousands of highly active, diverse nuclease enzymes using neural network models trained on experimental screening data, outperforming a traditional in vitro directed evolution campaign.

Tuned Fitness Landscapes for Benchmarking Model-Guided Protein Design


Neil Thomas*, Atish Agarwala*, David Belanger, Yun S. Song, Lucy J. Colwell
bioRxiv, 2022
paper / code / tweetorial

Tunable, realistic, synthetic fitness landscapes for benchmarking protein design.

Interpreting Potts and Transformer Protein Models Through the Lens of Simplified Attention


Nicholas Bhattacharya*, Neil Thomas*, Roshan Rao, Justas Dauparas, Peter K. Koo, David Baker, Yun S. Song, Sergey Ovchinnikov
Pacific Symposium on Biocomputing, 2022
paper / code / tweetorial / talk

Introduces “factored attention,” a simplified attention layer that we use to compare and contrast Potts models and Transformers.

Evaluating Protein Transfer Learning with TAPE


Roshan Rao*, Nicholas Bhattacharya*, Neil Thomas*, Yan Duan, Xi Chen, John Canny, Pieter Abbeel, Yun S. Song
Advances in Neural Information Processing Systems (Spotlight), 2019
paper / code / tweetorial / talk / podcast / blog

A suite of benchmarking tasks for protein language models.

Research

For a complete, up-to-date list, see Google Scholar.

Whole-genome sequencing reveals a complex African population demographic history and signatures of local adaptation


Shaohua Fan, Jeffrey P. Spence, ..., Neil Thomas, ... Yun S. Song, Sarah A. Tishkoff, et al.
Cell, 2023
paper

End-to-end learning of multiple sequence alignments with differentiable Smith-Waterman


Samantha Petti, Nicholas Bhattacharya, Roshan Rao, Justas Dauparas, Neil Thomas, Juannan Zhou, Alexander M. Rush, Peter K. Koo, Sergey Ovchinnikov
Bioinformatics, 2022
paper

Functional genomics of OCTN2 variants informs protein-specific variant effect predictor for Carnitine Transporter Deficiency


Megan L. Koleske, Gregory McInnes, Julia E. H. Brown, Neil Thomas, ... Yun S. Song, Russ B. Altman, Kathleen M. Giacomini, et al.
PNAS, 2022
paper

Minding the gaps: The importance of navigating holes in protein fitness landscapes


Neil Thomas, Lucy Colwell
Cell Systems (Preview), 2021
paper


Teaching

During my graduate studies at Berkeley, I had the privilege of teaching:

  • Summer 2022 CS 188: Introduction to Artificial Intelligence
  • Fall 2020 Stat 135: Concepts of Statistics

My teaching statement.


Built on Leonid Keselman's Jekyll fork of Jon Barron's website