Daan Speth
daanspeth.bsky.social
Computational microbiologist. Senior scientist at @cemess.bsky.social, @univie.ac.at.

Microbial ecology, mostly of nitrogen cycle microbes, and data driven physiology.

Maintainer of the GlobDB genome database https://globdb.org
thanks Ashish!
November 21, 2025 at 7:25 PM
I don’t have separate funding for the GlobDB, so I don’t think this is in the cards. Our cluster, where this is hosted, is separately funded and storage is available long term.
November 21, 2025 at 5:47 PM
Thanks Ryan. Let me know if there’s anything you’d like to see added, and I can see whether we can add it 👍
November 21, 2025 at 4:53 PM
Many thanks to the coauthors (Nick Pullen, @aroneys.bsky.social, @bcoltman.bsky.social, Jay Osvatic, @benjwoodcroft.bsky.social, Thomas Rattei, and @michiwagner4.bsky.social) for helping me put this resource together.

I hope others find it as useful as I do! 5/5
November 21, 2025 at 4:22 PM
Finally, for taxonomic analyses, the GlobDB includes a full seven level taxonomy that is compatible with, and extends, the GTDB taxonomy. This taxonomy was also used to create sylph databases and a SingleM metapackage for taxonomic profiling of read datasets. 4/5
November 21, 2025 at 4:22 PM
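As a rough illustration of what a seven-level, GTDB-compatible taxonomy looks like in practice, here is a minimal Python sketch that splits a GTDB-style lineage string into its ranks. It assumes GlobDB lineages use the same `d__;p__;c__;o__;f__;g__;s__` prefix convention as GTDB, which the post implies but does not spell out. The function name `parse_lineage` and the example lineage are hypothetical.

```python
# Sketch only: assumes GTDB-style rank prefixes (d__, p__, c__, o__, f__, g__, s__).
# parse_lineage is a hypothetical helper, not part of GlobDB or GTDB tooling.

RANK_PREFIXES = ["d", "p", "c", "o", "f", "g", "s"]
RANK_NAMES = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def parse_lineage(lineage: str) -> dict[str, str]:
    """Split a semicolon-separated seven-rank lineage into a rank -> name dict."""
    fields = [field.strip() for field in lineage.split(";")]
    if len(fields) != len(RANK_PREFIXES):
        raise ValueError(f"expected 7 ranks, got {len(fields)}")

    parsed = {}
    for prefix, name, field in zip(RANK_PREFIXES, RANK_NAMES, fields):
        expected = prefix + "__"
        if not field.startswith(expected):
            raise ValueError(f"rank {name!r} should start with {expected!r}: {field!r}")
        # Strip the two-underscore prefix; an empty string means the rank is unassigned.
        parsed[name] = field[len(expected):]
    return parsed


if __name__ == "__main__":
    example = (
        "d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria;"
        "o__Enterobacterales;f__Enterobacteriaceae;g__Escherichia;"
        "s__Escherichia coli"
    )
    print(parse_lineage(example)["genus"])
```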
For protein analyses, the GlobDB provides the amino acid FASTA files for all genomes, as well as KEGG/COG/Pfam annotations, and a clustered dataset (40% identity over 80% of both sequences) of ~80M proteins. For this clustered dataset, PLM embeddings are available. 3/5
November 21, 2025 at 4:22 PM
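To make the clustering threshold concrete, here is a small Python sketch of a "40% identity over 80% of both sequences" check for a single pairwise alignment. The post does not say which tool was used or how identity and coverage were computed, so the function `meets_cluster_criterion`, its parameters, and the coverage definition (aligned residues divided by sequence length) are all assumptions for illustration.

```python
# Sketch only: one plausible reading of "40% id over 80% of both sequences".
# The function and its parameters are hypothetical, not the GlobDB pipeline.

def meets_cluster_criterion(
    identity: float,          # fraction of identical residues in the alignment (0-1)
    query_aligned: int,       # number of query residues covered by the alignment
    target_aligned: int,      # number of target residues covered by the alignment
    query_length: int,        # full length of the query protein
    target_length: int,       # full length of the target protein
    min_identity: float = 0.40,
    min_coverage: float = 0.80,
) -> bool:
    """Return True if the pair passes both the identity and bidirectional coverage cutoffs."""
    query_coverage = query_aligned / query_length
    target_coverage = target_aligned / target_length
    return (
        identity >= min_identity
        and query_coverage >= min_coverage
        and target_coverage >= min_coverage
    )


if __name__ == "__main__":
    # 45% identity, alignment covers 90% of the query and 85% of the target.
    print(meets_cluster_criterion(0.45, 270, 255, 300, 300))
    # Same identity, but the alignment covers only half of the longer target.
    print(meets_cluster_criterion(0.45, 270, 300, 300, 600))
```

Requiring coverage on both sequences keeps a short protein from being clustered with a much longer multi-domain protein that merely contains it.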
The GlobDB extends the GTDB (@ace-gtdb.bsky.social) by over 160,000 species representatives. In addition to the added diversity captured, we provide several analysis products for download.

For the genomes, there are anvi'o dbs, genome FASTA, quality stats, and GFF files. 2/5
November 21, 2025 at 4:21 PM