Daan Speth
daanspeth.bsky.social
Computational microbiologist. Senior scientist at @cemess.bsky.social, @univie.ac.at.

Microbial ecology, mostly of nitrogen cycle microbes, and data-driven physiology.

Maintainer of the GlobDB genome database https://globdb.org
Pinned
Our paper describing the GlobDB is now published in @bioinfoadv.bsky.social
doi.org/10.1093/bioa...

The GlobDB is the largest species-dereplicated genome database currently available, containing 306,260 species representatives.
More information on globdb.org 1/5
🖥️🧬🦠
GlobDB: a comprehensive species-dereplicated microbial genome resource
Abstract. Motivation. Over the past years, substantial numbers of microbial species’ genomes have been deposited outside of conventional INSDC databases. Resu
doi.org
Reposted by Daan Speth
Come join us! Soon I will be advertising a postdoc vacancy in my group as part of my @erc.europa.eu AdG project 'DARK ROOTS'. Focus of the project will be on phylo- and metagenomic mining of novel prokaryotic lineages. I will soon post a link here - stay tuned, and please repost! #asgardarchaea
November 26, 2025 at 3:39 PM
Reposted by Daan Speth
🧫 Just out in Bioinformatics Advances: “GlobDB: A comprehensive species-dereplicated microbial genome resource.” 

Explore the full study: https://doi.org/10.1093/bioadv/vbaf280
November 26, 2025 at 10:01 AM
Reposted by Daan Speth
AI model Helixer predicts eukaryotic genes ab initio, directly from a plain text FASTA file.

No RNA-seq.
No protein homology.
No repeats, hints, or curated evidence.

Raw genome → accurate gene models.
Deep learning + HMM, published in @natmethods.nature.com

www.nature.com/articles/s41...
Helixer: ab initio prediction of primary eukaryotic gene models combining deep learning and a hidden Markov model - Nature Methods
By leveraging both deep learning and hidden Markov models, Helixer achieves broad taxonomic coverage for ab initio gene annotation of eukaryotic genomes from fungi, plants, vertebrates and invertebrat...
www.nature.com
November 26, 2025 at 6:00 AM
November 26, 2025 at 8:45 AM
Reposted by Daan Speth
We are hiring! If you're interested in exploring the #biogeography of #prokaryotes using #genomics and #metagenomics, are interested in biology and geographic information systems, and are passionate about #OpenScience, this is for you! New PhD and PostDoc positions in my lab at Aalborg University
November 24, 2025 at 8:49 PM
Reposted by Daan Speth
Boltz-2: an AI model approaching free-energy perturbation accuracy for small-molecule–protein binding affinity!
It builds on Boltz-1, an open-source model from MIT that predicts the 3D structures of biomolecular complexes with AlphaFold3-level accuracy 🧪✨ 🧬 & 🖥️
www.biorxiv.org/content/10.1...
November 25, 2025 at 5:38 AM
Reposted by Daan Speth
I knew early on I wanted to work with computers, but because of dyslexia I ended up in a lower-tier German school. The career office said a tech job wasn’t realistic. I ignored that, took a convoluted path into university, discovered bioinformatics, got hooked on algorithms & proteins, and became a PI.
What’s the lore behind choosing your career path?
November 24, 2025 at 2:42 AM
Reposted by Daan Speth
📣 Open position: We are looking for a new #ScientificLead for the #SILVA database 🧬🖥️

You will be responsible for guiding the development of this important resource and for curating the SILVA taxonomy🌳and more!

👉 www.dsmz.de/dsmz/career/...

Read and share ‼️

#sciencejobs 🦠🧪 #career #taxonomy
November 22, 2025 at 3:36 PM
Reposted by Daan Speth
Just in time for me to cite in something I've been using GlobDB for
November 22, 2025 at 9:32 AM
Reposted by Daan Speth
GlobDB is a great resource and is frequently of use in my research!

Developed and maintained by @daanspeth.bsky.social, I was glad to contribute in a small way. Have a look at Daan’s thread to find out more and let us know how you use it! >>
November 21, 2025 at 4:35 PM
Reposted by Daan Speth
Congratulations @daanspeth.bsky.social and the whole team. GlobDB has already been very helpful for several projects at @cemess.bsky.social, and it will no doubt be a useful resource for many microbiologists around the world. Really glad to see it out there.
November 21, 2025 at 6:28 PM
Reposted by Daan Speth
Great news. I've been using GlobDB for tracking the distribution of a class of replicators across taxonomies. Highly, highly recommended for people looking for a dereplicated dataset.

🧬💻
November 21, 2025 at 4:37 PM
Our paper describing the GlobDB is now published in @bioinfoadv.bsky.social
doi.org/10.1093/bioa...

The GlobDB is the largest species-dereplicated genome database currently available, containing 306,260 species representatives.
More information on globdb.org 1/5
🖥️🧬🦠
GlobDB: a comprehensive species-dereplicated microbial genome resource
Abstract. Motivation. Over the past years, substantial numbers of microbial species’ genomes have been deposited outside of conventional INSDC databases. Resu
doi.org
November 21, 2025 at 4:21 PM
Reposted by Daan Speth
There's a PhD position now available with me in Bath, on the evolution of symbiosis. www.findaphd.com/phds/project.... The supervisory team also includes @anja1.bsky.social @phil-donoghue.bsky.social and others. NB, this is open both to UK-based students *and* to international students :)
The genomic basis of symbiotic integration at University of Bath on FindAPhD.com
PhD Project - The genomic basis of symbiotic integration at University of Bath, listed on FindAPhD.com
www.findaphd.com
October 9, 2025 at 9:39 AM
Reposted by Daan Speth
🚀New preprint from our lab!
I am very excited to finally share what has been the main focus of my PhD for almost 3 years! It is about viral dark matter and a powerful tool we built to shed light on it. 🧬💡
Continue reading (🧵)
November 20, 2025 at 6:52 PM
Reposted by Daan Speth
We have a date for the free-to-attend #anvio workshop and ECR Symposium for 2026, and we look forward to meeting you at the @hifmb.de in Oldenburg, Germany!

Please find more information on the venue, program, and the application form here, and spread the word 😇

anvio.org/workshops/20...
November 20, 2025 at 6:42 PM
Reposted by Daan Speth
Junior professor position at the University of Kaiserslautern! Our Department of Biology is seeking a colleague working with microalgae. This is a wonderful place to do wonderful science with great people! Please, repost :)
Further details:
jobs.rptu.de/jobposting/1...
November 20, 2025 at 3:28 PM
Reposted by Daan Speth
🧬🖥️SILVA in 2026: a global core biodata resource for rRNA within the DSMZ digital diversity

📑The new publication about the #SILVA database for the #NAR database issue is now online.

👉 academic.oup.com/nar/article/...
SILVA in 2026: a global core biodata resource for rRNA within the DSMZ digital diversity
Abstract. Since 2007, the SILVA database (https://www.arb-silva.de/) has served as a comprehensive resource providing quality-checked, aligned, and classif
academic.oup.com
November 19, 2025 at 10:36 AM
Reposted by Daan Speth
🚨Job klaxon 🚨

University College Cork is looking to appoint a lecturer in Medical Microbiology into a permanent, non-clinical post

A great opportunity in a microbiology powerhouse

For details go to my.corehr.com/pls/uccrecru... and enter reference number 092153
University College Cork Vacancies
my.corehr.com
November 18, 2025 at 10:55 PM
Reposted by Daan Speth
The University at Albany Biomedical Sciences Department is hiring at the Assistant Professor level. We are looking for researchers at the intersection of infectious disease and artificial intelligence:
albany.interviewexchange.com/jobofferdeta...
#infectiousdisease #artificialintelligence
albany.interviewexchange.com
November 18, 2025 at 10:45 PM
Reposted by Daan Speth
OK, #bioinformatics folk. We have some (many many) reads from a metagenome. They have been binned into a bacterial genome. They have no matches to any known genome in any database. They code for "bacterial" genes. What are good triple-checks to do to argue that they are not, in fact, euk sequence?
November 18, 2025 at 5:29 PM
Reposted by Daan Speth
📣 Workshop announcement: This year's #SILVA 🧬 online workshop will take place on 🗓️ 11th December, at 3pm CET.

The #SILVA team will present the upcoming release, the new website design 🖥️, and more!

Write to [email protected] to receive the Zoom login details

#science #FAIRdata 🧫🧪
November 17, 2025 at 2:51 PM
Reposted by Daan Speth
Clone-FISH paper out: Manuscript/resource alert #microsky 🦠 We present a collection of 30 E. coli (CloneFISH) cultures, each carrying a plasmid for the heterologous expression of a (near) full-length 16S rRNA gene from one of 30 lineages of archaea, including 17 yet uncultured ones.
November 17, 2025 at 2:36 PM