Adam Auton
adamauton.bsky.social
Geneticist @ 23andMe
Interestingly, you can probe the internal embeddings of these models, and we found that the causal genes tend to be 'proximal' to the phenotypes that they influence in embedding space. So they do seem to be learning some relationship between these two concepts.
June 3, 2024 at 3:11 AM
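One way to picture the "proximal in embedding space" idea is to rank the candidate genes in a locus by how similar their embedding vectors are to the phenotype's embedding. The sketch below is only an illustration: the post does not say which distance measure was used, so cosine similarity is an assumption, and the gene names and three-dimensional vectors are hypothetical toy values rather than real model embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_genes_by_proximity(phenotype_vec, gene_vecs):
    """Return (gene, similarity) pairs, most similar to the phenotype first."""
    scored = [
        (gene, cosine_similarity(phenotype_vec, vec))
        for gene, vec in gene_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; real LLM embeddings have many more dimensions.
phenotype = [1.0, 0.0, 0.0]
genes = {
    "GENE_A": [0.9, 0.1, 0.0],
    "GENE_B": [0.0, 1.0, 0.0],
    "GENE_C": [0.5, 0.5, 0.0],
}

ranking = rank_genes_by_proximity(phenotype, genes)
for gene, score in ranking:
    print(f"{gene}: {score:.3f}")
```

In this toy setup the gene whose vector points most nearly in the same direction as the phenotype vector comes out on top, which is the sense of "proximal" assumed here.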
Nonetheless, the LLMs also have biases; they tend to favor genes with lots of existing literature, which perhaps isn't surprising given how they're trained. They also struggle to identify causal genes in loci containing large numbers of genes.
June 3, 2024 at 3:10 AM
The answer appears to be yes! In fact, using a *really simple* approach, LLMs appear to outperform state-of-the-art methods at identifying the causal gene in a variety of 'gold standard' truth datasets!
June 3, 2024 at 3:10 AM
I went to a place
September 11, 2023 at 1:23 AM