Noam Teyssier
@noamteyssier.bsky.social
140 followers 97 following 57 posts
Bioinformatics Scientist at the Arc Institute. Working at the intersection of functional genomics, systems biology, and network dynamics. I also build rusty bioinformatics tools https://github.com/noamteyssier
The workspace publishing has been such a hassle. So glad to see this out
Sounds great! Would be very interested in that and happy to help build one
bsky.app/profile/noam...

Here's a benchmark I ran a while back comparing twobit and binseq on a single thread
Ah yes 2bit was a big inspiration for binseq - I didn't include it because it wasn't widely used and it was more geared towards large genomes so I figured it wouldn't scale.

But you're right I didn't formally test it. Here's a simple bench with Kent's utils (1-core bqtools to be fair)
2bit was built for genomes, where there are very long contiguous N-blocks. On fastq-style records (generally very short, with non-contiguous Ns), though, the overhead of managing these blocks is massive and most of the time unnecessary.
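To make the N-block point concrete, here is a minimal sketch of 2-bit packing alongside a 2bit-style side table of N runs. This is an illustration only, not the actual code or on-disk layout of binseq or Kent's utils; the function names `encode_base`, `pack`, `unpack`, and `n_blocks` are hypothetical.

```rust
/// Map one base to its 2-bit code. N (or anything else) has no code,
/// which is why a 2-bit layout needs a separate way to represent it.
fn encode_base(base: u8) -> Option<u64> {
    match base {
        b'A' | b'a' => Some(0b00),
        b'C' | b'c' => Some(0b01),
        b'G' | b'g' => Some(0b10),
        b'T' | b't' => Some(0b11),
        _ => None,
    }
}

/// Pack up to 32 bases into one u64, first base in the lowest bits.
/// Returns None if the sequence is too long or contains a non-ACGT base.
fn pack(seq: &[u8]) -> Option<u64> {
    if seq.len() > 32 {
        return None;
    }
    let mut packed = 0u64;
    for (i, &base) in seq.iter().enumerate() {
        packed |= encode_base(base)? << (2 * i);
    }
    Some(packed)
}

/// Unpack `len` bases from a u64 produced by `pack`.
fn unpack(packed: u64, len: usize) -> Vec<u8> {
    const BASES: [u8; 4] = *b"ACGT";
    (0..len)
        .map(|i| BASES[((packed >> (2 * i)) & 0b11) as usize])
        .collect()
}

/// Collect runs of N as (start, length) pairs: the kind of side table
/// a 2bit-style layout keeps next to the packed bases.
fn n_blocks(seq: &[u8]) -> Vec<(usize, usize)> {
    let mut blocks = Vec::new();
    let mut i = 0;
    while i < seq.len() {
        if seq[i] == b'N' || seq[i] == b'n' {
            let start = i;
            while i < seq.len() && (seq[i] == b'N' || seq[i] == b'n') {
                i += 1;
            }
            blocks.push((start, i - start));
        } else {
            i += 1;
        }
    }
    blocks
}
```

On a genome, one `(start, length)` pair can cover millions of Ns. On a short read with a few scattered Ns, each N becomes its own pair, so the side table can cost more than the bases it annotates.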
Reposted by Noam Teyssier
Paraseq 0.4 is out now! With double the throughput for processing paired-end input :)

github.com/noamteyssier...
Added a feature to bqtools yesterday for colored grep output. It supports colored FASTX output as well. Already useful this morning as I troubleshoot some sequencing outputs!
Writing in Rust again after a long stretch of Python is such a breath of fresh air.
Are you going to have a remote component to this? Would love to watch some of these talks if I can
Ah this is the way that I do it in paraseq! Doesn't work for fastq headers but works well for fasta
Reposted by Noam Teyssier
Introducing Arc Institute’s first virtual cell model: STATE
Reposted by Noam Teyssier
Preprint on "Improving spliced alignment by modeling splice sites with deep learning". It describes minisplice for modeling splice signals. Minimap2 and miniprot now optionally use the predicted scores to improve spliced alignment.
arxiv.org/abs/2506.12986
R.I.P your email inbox haha
Reposted by Noam Teyssier
New preprint! Deacon is a versatile tool for filtering FASTA/FASTQ files and streams at hundreds of megabases per second using minimizers, built with rapid metagenomic host depletion in mind, but equally useful for search.
github.com/bede/deacon
Reposted by Noam Teyssier
ish is a grep-like CLI tool that uses optimal alignment instead of exact matching.

It’s record-type aware, supporting line, FASTA, and FASTQ records.

Built in Mojo as a proof of concept for bioinformatics.

🧵1/5
Ish: SIMD and GPU Accelerated Local and Semi-Global Alignment as a CLI Filtering Tool https://www.biorxiv.org/content/10.1101/2025.06.04.657890v1
A good workaround for defaults I use sometimes is Bon. Adds to compile times though which can be annoying

bon-rs.com
Bon
Next-gen compile-time-checked builder generator, named function's arguments, and more!
lol what expires in this? It’s like pure metal
Reposted by Noam Teyssier
Slides from my talk (with @kamilsjaron.bsky.social) on a history of k-mers in bioinformatics: rayan.chikhi.name/pdf/2025-kme...
Love seeing audio stuff in rust. How’d you make the visualization?
Reposted by Noam Teyssier
📜 Excited to share insights from our recent paper: "Kaminari: a resource-frugal index for approximate colored k-mer queries". The study aims to efficiently identify documents containing a query string, focusing on DNA strings. www.biorxiv.org/content/10.1... 🧬 🖥️ 1/8
One of the great success stories of change haha
I think the best way to spur change is to make the new solution as easy as the old one. If it's an easy swap, then I think people will try it out and convince themselves it's worth it.

Like swapping out std::collections::HashMap for hashbrown::HashMap.

But it's easier said than done
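As a rough sketch of what that kind of swap looks like: the code below compiles against the standard library as written, and the only intended change is the `use` line. It assumes the hashbrown crate has been added as a dependency with its default features; `count_kmers` is a hypothetical helper made up for illustration.

```rust
// To try the swap, change only this line to `use hashbrown::HashMap;`
// (assumes hashbrown is listed as a dependency in Cargo.toml).
use std::collections::HashMap;

/// Count occurrences of each k-mer in a sequence.
/// Hypothetical helper; nothing below the `use` line needs to change.
fn count_kmers(seq: &[u8], k: usize) -> HashMap<&[u8], usize> {
    let mut counts = HashMap::new();
    if k == 0 {
        // `windows(0)` would panic, so treat k = 0 as "no k-mers".
        return counts;
    }
    for kmer in seq.windows(k) {
        *counts.entry(kmer).or_insert(0) += 1;
    }
    counts
}
```

Because the two map types expose the same core methods (`new`, `entry`, `get`, and so on), the function body stays untouched, which is what makes the swap cheap to try.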