Caleb Lareau
caleblareau.bsky.social
assistant professor @mskcancercenter.bsky.social

clareaulab.com
I had an unbelievable first time running the NYC Marathon with @fredsteammskcc.bsky.social and @mskcancercenter.bsky.social! Shout out to the lab for supporting us!
November 3, 2025 at 3:45 PM
Typical day in lab honestly
October 31, 2025 at 4:47 PM
The squad 🍌🍌🍌
October 31, 2025 at 4:47 PM
We have some BIG PLANS TONIGHT!!!
with @ronanchaligne.bsky.social @mskcancercenter.bsky.social
October 31, 2025 at 4:47 PM
October 18, 2025 at 5:09 PM
Most importantly, I gotta give a shout out to my parents and my better half @isabelfulcher.bsky.social for crewing the entire spectacle!!

My first ultra in ~9 years… Couldn’t have done it without you! Don’s fries hit different after mile 80 lol
August 5, 2025 at 11:50 AM
Had a blast this weekend at the Badger Trail Races— super ULTRA mega congrats to @colehoneycutt.bsky.social and @paulklauser.bsky.social on their first 100M finishes!!!!

Together, we ran 303.3 miles in under 75 hours, burning 50,000+ calories along the way!
August 5, 2025 at 11:50 AM
Ultimately, our data reflect a clear pattern-- the outcomes of infections are highly heterogeneous, and for herpesviruses, the persistence of the viral genome is driven by MHC II variation. 14/
July 22, 2025 at 9:59 PM
Borrowing from the immunotherapy literature, we then scored each individual based on their predicted best peptide to derive a per-allele, per-person score of EBV presentation ability. This measure was strongly predictive for class II alleles and reproducible in both cohorts! 11/
July 22, 2025 at 9:59 PM
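The "best peptide" scoring in 11/ can be sketched roughly like this. It is a minimal illustration, not the preprint's pipeline: the score table, the peptide names, and the "higher means stronger predicted presentation" convention are all assumptions made for the example, and the real NetMHCpan output would need to be parsed into this shape first.

```python
def best_peptide_score(predictions):
    """For each allele, keep the score of its best-predicted peptide.

    `predictions` maps allele -> {peptide: score}; higher = stronger (assumed).
    """
    return {allele: max(scores.values()) for allele, scores in predictions.items()}


def person_scores(genotypes, allele_best):
    """Give each person one score per allele they carry (per-allele, per-person)."""
    return {
        person: {a: allele_best[a] for a in alleles if a in allele_best}
        for person, alleles in genotypes.items()
    }


# Hypothetical inputs: made-up scores and placeholder peptide names.
example_predictions = {
    "DRB1*15:01": {"pepA": 0.2, "pepB": 0.9},
    "DQB1*06:02": {"pepA": 0.4, "pepB": 0.1},
}
example_genotypes = {
    "person1": ["DRB1*15:01", "DQB1*06:02"],
    "person2": ["DQB1*06:02"],
}

allele_best = best_peptide_score(example_predictions)
scores = person_scores(example_genotypes, allele_best)
```

Taking the maximum per allele mirrors the "predicted best peptide" wording; any other summary (a mean, a count above a threshold) would be a different score.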
HLA was by far and away the strongest association, so we sought to dig in deeper. In our GWAS, we had the distinct advantage of knowing what peptides were presented on class I&II (i.e., the EBV proteome)--we scored all participants for predicted EBV peptide presentation strength via NetMHCpan. 10/
July 22, 2025 at 9:59 PM
Overall, this genetic architecture of EBV DNAemia covaried with various autoimmune conditions, including SLE & RA (known EBV-linked conditions) with an opposite effect for IBD. This demonstrates that GWAS signal for these autoimmune traits is pleiotropic with EBV DNAemia. 9/
July 22, 2025 at 9:59 PM
Next, we used EBV DNAemia as an outcome in a GWAS, identifying 20+ independent loci near many notable immune-associated genes such as HLA, CTLA4, EOMES, etc. The effects were strongly concordant between UKB and All of Us, a replication cohort. 8/
July 22, 2025 at 9:59 PM
With EBV DNAemia as a biomarker, we then performed a phenome-wide association study, looking to identify what complex traits are linked to persistence of the viral infection. Many known and novel hits were significant, from respiratory disease to autoimmunity! 7/
July 22, 2025 at 9:59 PM
By removing these regions, our measure of EBV DNAemia (i.e., persistent DNA in peripheral blood via WGS) could be derived for ~750,000 individuals. For a subset of the UKB where serology was available, our corrected measure was highly concordant only after masking these biased regions. 6/
July 22, 2025 at 9:59 PM
The workflow wasn't as straightforward as we originally thought. Due to highly repetitive regions of the EBV contig, low-quality alignments gave a pervasive false-positive signal, which we were able to ultimately mitigate. 5/
July 22, 2025 at 9:59 PM
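One way to picture the mitigation from 5/ and 6/: drop low-quality alignments and anything overlapping a masked (repetitive) interval before counting. This is only a sketch under stated assumptions. The MAPQ cutoff of 30, the interval coordinates, and the read tuples are hypothetical, not the values used in the preprint.

```python
def keep_read(start, end, mapq, masked_regions, min_mapq=30):
    """True if a read passes the MAPQ cutoff and overlaps no masked interval.

    Coordinates are half-open [start, end); min_mapq=30 is an assumed cutoff.
    """
    if mapq < min_mapq:
        return False
    # Two half-open intervals overlap when each starts before the other ends.
    return not any(start < m_end and m_start < end for m_start, m_end in masked_regions)


def filtered_count(reads, masked_regions, min_mapq=30):
    """Count reads (start, end, mapq) that survive both filters."""
    return sum(keep_read(s, e, q, masked_regions, min_mapq) for s, e, q in reads)


# Hypothetical reads on the EBV contig and one hypothetical masked repeat.
example_reads = [
    (100, 250, 60),    # clean read outside the mask: kept
    (5000, 5150, 60),  # high MAPQ but inside the mask: dropped
    (5100, 5250, 10),  # low MAPQ: dropped
    (9000, 9150, 0),   # low MAPQ: dropped
]
example_mask = [(4900, 5200)]
n_kept = filtered_count(example_reads, example_mask)
```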
In a cool twist of fate, the EBV contig is in hg38 to mop up unscrupulous EBV reads, a by-product of immortalizing lymphoblastoid cell lines (used in 1000 Genomes Project, etc.). Hence, a simple `samtools view` could get a measure of persistent EBV DNA in large WGS cohorts, e.g., UK Biobank. 4/
July 22, 2025 at 9:59 PM
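For a sense of what the "simple `samtools view`" in 4/ might look like, here is a small Python wrapper that counts reads on the EBV contig. It is a sketch, not the lab's workflow: the contig name `chrEBV` (as used in common hg38 analysis sets), the file name, and the optional MAPQ filter are assumptions, and it needs samtools on the PATH plus an indexed BAM/CRAM to actually run.

```python
import shutil
import subprocess


def ebv_count_command(alignment_path, contig="chrEBV", min_mapq=0):
    """Build a `samtools view` call that counts reads aligned to one contig."""
    # -c prints a read count instead of the reads; -q skips reads below min_mapq.
    return ["samtools", "view", "-c", "-q", str(min_mapq), alignment_path, contig]


def count_ebv_reads(alignment_path, **kwargs):
    """Run the command and return the count as an int."""
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools not found on PATH")
    result = subprocess.run(
        ebv_count_command(alignment_path, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


# "sample.cram" is a placeholder path; nothing is executed here.
cmd = ebv_count_command("sample.cram", min_mapq=30)
```

A CRAM input may also need its reference supplied to samtools, which this sketch leaves out.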
In this work, we focus on Epstein-Barr virus-- an enigma in many ways. While >90% of us have been infected by this endemic virus, most of us will live just fine. However, for a subset of the population, this infection can be a trigger of disease-- from MS to cancer to autoimmunity. 2/
July 22, 2025 at 9:59 PM
Excited to share a new preprint from the lab with @ryandhindsa.bsky.social ! www.biorxiv.org/content/10.1...

Led by @sherrynyeo.bsky.social, @erinmayc.bsky.social, and friends, we continue our journey to find viral DNA in our favorite place-- the overlooked and discarded reads in existing data! 1/
July 22, 2025 at 9:59 PM
PIPseq V uses these "random enzymatic fragmentation sites" to create "intrinsic molecular identifiers". So I think a likely cause is the barcoding of different parts of the molecule that can be detected vs the strong 3' preference in the 10x workflow.
March 14, 2025 at 2:08 PM
To give a flavor for this, here are a few genes almost selectively detected in one technology but not the other. This is the interesting/scary part-- I'm not sure if these genes do anything in immune cells, but we've been blind to them in single-cell and/or there's some false-positive signal /9
March 14, 2025 at 1:29 PM
As one would expect, there's a strong correlation in pseudobulk gene expression (global r = 0.90), but what surprised me was many genes were detected almost exclusively in one technology but not the other /8
March 14, 2025 at 1:29 PM
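The pseudobulk comparison in /8 boils down to summing counts per gene within each technology, normalizing, and correlating. Below is a dependency-free sketch of that idea. The log-CPM normalization, the `min_count` threshold for calling a gene "exclusive", and the dict-per-cell input format are assumptions for illustration, not a description of how the numbers in the thread were computed.

```python
import math


def pseudobulk(cells):
    """Sum per-gene counts over cells; each cell is a dict gene -> count."""
    totals = {}
    for cell in cells:
        for gene, n in cell.items():
            totals[gene] = totals.get(gene, 0) + n
    return totals


def log_cpm(totals, genes):
    """log1p of counts-per-million for the requested genes (assumed normalization)."""
    depth = sum(totals.values())
    return [math.log1p(1e6 * totals.get(g, 0) / depth) for g in genes]


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def exclusive_genes(totals_a, totals_b, min_count=10):
    """Genes with at least min_count in A and zero counts in B (hypothetical cutoff)."""
    return sorted(g for g, n in totals_a.items() if n >= min_count and totals_b.get(g, 0) == 0)
```

Usage would be along the lines of `pearson(log_cpm(a, genes), log_cpm(b, genes))` over the union of genes, with `exclusive_genes` run in both directions to list the one-technology-only genes.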
The differences start to emerge when you look a bit more closely. In terms of molecular complexity, PIPseq performed pretty well, but there's a pretty massive difference compared to GEM-X. At similar cell numbers and sequencing depth, we get about 63% more UMIs / cell with GEM-X over PIPseq V /7
March 14, 2025 at 1:29 PM
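As a quick illustration of what "about 63% more UMIs / cell" in /7 means arithmetically, here is a tiny helper. It assumes the comparison is between median UMIs per cell, which the thread does not state, and the numbers below are made up to reproduce a 63% gain rather than taken from the experiment.

```python
from statistics import median


def relative_gain(umis_a, umis_b):
    """Percent by which median UMIs/cell in A exceeds that in B."""
    return 100.0 * (median(umis_a) / median(umis_b) - 1.0)


# Hypothetical per-cell UMI counts, chosen so the medians are 1630 and 1000.
gain = relative_gain([1500, 1630, 1800], [900, 1000, 1100])
```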
Labeling cell types with Azimuth, the correspondence between the technologies at this resolution is pretty clear. I dug into the slight differences between the two and couldn't find an explanation that I trusted (e.g., neither overclustering, cell quality, nor mislabels accounted for the points off the y=x line here) /6
March 14, 2025 at 1:29 PM
So what did we see? In brief, both technologies readily delineate the major cell types in PBMCs with clear expression of marker genes we've come to know and love. So at a bird's-eye view, cell state separation and annotation are pretty consistent between technologies. /5
March 14, 2025 at 1:29 PM
We designed a basic head-to-head experiment using a cryovial of PBMCs from a healthy donor, split and profiled via 10x GEM-X 3' and PIPseq V (T20). @karolisk.bsky.social masterfully generated data on his first try with the Single-cell Analytics Innovation Lab (SAIL) here at MSKCC. /3
March 14, 2025 at 1:29 PM