Lightnews — Scholar-powered news

Reposted by Nezar Abdennur

Jacob Schreiber @jmschreiber91.bsky.social · Aug 27

In the genomics community, we have focused pretty heavily on achieving state-of-the-art predictive performance.

While undoubtedly important, how we *use* these models after training is potentially even more important.

tangermeme v1.0.0 is out now. Hope you find it useful!

Nezar Abdennur @nvictus.bsky.social · Aug 25

My talk on #Composability in genomic software at #SciPy2025 is up on YouTube, where I showcase both #anywidget and #oxbow.

Thank you to the organizers for the opportunity to present this to both computational biologists and the wider scientific computing community!

www.youtube.com/watch?v=G22_...

Nezar Abdennur - Accelerating Genomic Data Science and AI/ML with Composability | SciPy 2025

Reposted by Nezar Abdennur

trevor manz @manzt.sh · Aug 7

if interested in creating anywidgets of your own, our tutorial was finally shared to youtube:

www.youtube.com/watch?v=frEo...

Abdennur, Lekschas, & Manz - Bring your __repr__’s to life with anywidget | SciPy 2024

Nezar Abdennur @nvictus.bsky.social · Aug 19

Our #anywidget tutorial from last year's #SciPy conference was uploaded to YouTube! Check it out for a hands-on walkthrough of creating your own web-based widgets.

Nezar Abdennur @nvictus.bsky.social · Aug 11

We anticipate that joint dimensionality reduction and projection will become a foundational norm for comparative and integrative analysis of long-range interaction profiles in Hi-C/3C+ data. For example, existing methods for working with classic A/B vectors can be extended to joint higher-order embeddings.

Nezar Abdennur @nvictus.bsky.social · Aug 11

We used jointly-hic to create an atlas of 89 human Hi-C samples, uncovering distinct patterns of nuclear architecture associated with heterochromatin composition and demonstrating how higher-order principal components capture missing information about gene expression and regulatory element activity.

Nezar Abdennur @nvictus.bsky.social · Aug 11

jointly-hic accomplishes this using mini-batch incremental PCA, allowing for joint decomposition of arbitrarily many contact matrices at any resolution with constant memory.
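
As a rough illustration of the idea in this post, here is a minimal sketch of mini-batch incremental PCA over several matrices. It assumes scikit-learn's `IncrementalPCA` and uses random toy data. The matrix sizes, the `toy_matrix` helper, and the batch size are all hypothetical, and this is not presented as how jointly-hic itself is implemented.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)
n_bins, n_features, n_components = 200, 50, 4  # hypothetical sizes


def toy_matrix(shift):
    # Hypothetical stand-in for one sample's preprocessed contact rows:
    # one row per genomic bin, one column per shared feature.
    return rng.normal(size=(n_bins, n_features)) + shift


matrices = [toy_matrix(s) for s in (0.0, 0.5, 1.0)]

# Fit a single model on mini-batches drawn from every sample, so all
# samples end up sharing one set of principal axes.
ipca = IncrementalPCA(n_components=n_components)
batch_size = 100
for mat in matrices:
    for start in range(0, n_bins, batch_size):
        ipca.partial_fit(mat[start:start + batch_size])

# Project each sample onto the shared axes; rows are now comparable
# across samples because they live in the same coordinate system.
embeddings = [ipca.transform(mat) for mat in matrices]
```

Because `partial_fit` only ever sees one batch at a time, peak memory depends on the batch size and the number of features rather than on how many matrices are streamed through.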

Nezar Abdennur @nvictus.bsky.social · Aug 11

Joint decomposition allows for robust and directly comparable low-dimensional representations of arbitrarily many contact maps, providing insights into genome organization across diverse biological contexts, from different tissues to developmental stages.

Nezar Abdennur @nvictus.bsky.social · Aug 11

The classic A/B compartment track comes from matrix factorization of a contact matrix into eigenvectors or PCs. Done separately, each map is projected onto a different coordinate system. Comparing such vectors directly is problematic, especially if seeking info from **higher-order** components.
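
One small, concrete reason separately computed vectors are hard to compare is that an eigenvector is only defined up to sign (and, more generally, each decomposition picks its own basis). The NumPy sketch below shows this on a random symmetric toy matrix; the data is hypothetical and is not a real contact map.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=(6, 6))
sym = (a + a.T) / 2  # symmetric toy matrix (hypothetical data)

# eigh returns eigenvalues in ascending order, so the last column is
# the eigenvector paired with the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(sym)
lam = eigvals[-1]
v = eigvecs[:, -1]

# Both v and -v satisfy the eigenvector equation, so two independent
# decompositions can legitimately return tracks with opposite signs.
ok_pos = np.allclose(sym @ v, lam * v)
ok_neg = np.allclose(sym @ (-v), lam * (-v))
```

Projecting every map onto one shared set of axes sidesteps this ambiguity, since the orientation is fixed once for all samples.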

Nezar Abdennur @nvictus.bsky.social · Aug 11

We introduce a framework and Python toolkit (github.com/abdenlab/joi...) for analyzing compartmentalization and long-range interactions in chromosome conformation capture data.

GitHub - abdenlab/jointly-hic: Genomics research toolkit for jointly embedding Hi-C 3D chromatin contact matrices into the same vector space

Nezar Abdennur @nvictus.bsky.social · Aug 11

We're excited to share our new preprint, "Joint decomposition of Hi-C maps reveals salient features of genome architecture across tissues and development", led by Thomas Reimonn. www.biorxiv.org/content/10.1...

The spatial organization of chromosomes in the nucleus is fundamental to cellular processes. Contact frequency maps from Hi-C and related chromosome conformation capture assays are increasingly availa...

Nezar Abdennur @nvictus.bsky.social · Jul 9

Yes, and more recently Zarr too academic.oup.com/gigascience/...

While oxbow makes legacy data more accessible, it is also a good conduit to more general-purpose persistent storage.

Analysis-ready VCF at Biobank scale using Zarr

Abstract. Background: Variant Call Format (VCF) is the standard file format for interchanging genetic variation data and associated quality control metrics.

Reposted by Nezar Abdennur

Jacob Schreiber @jmschreiber91.bsky.social · May 13

A huge challenge I face when doing ML + genomics analysis is *friction*: the stupid error messages (wrong device!) and dumb implementation issues that snap you out of the zone. I wrote a vignette on how tangermeme has helped me reduce this friction:

tangermeme.readthedocs.io/en/latest/ho...

How To: Reduce Friction and Save Time with Tangermeme — tangermeme v0.1.0 documentation

Reposted by Nezar Abdennur

Jacob Schreiber @jmschreiber91.bsky.social · Jun 30

(4) bpnet-lite: Load official Chrom/BPNet models into PyTorch for downstream tangermeme integration. Improved command-line tools + docs. Still concerns about perf of models trained from scratch -- will be resolved next version!

github.com/jmschrei/bpn...

bsky.app/profile/jmsc...

Nezar Abdennur @nvictus.bsky.social · Jul 7

We’re excited and eager for feedback, so please give oxbow a try!

`pip install oxbow`

Nezar Abdennur @nvictus.bsky.social · Jul 7

I’m also excited to be presenting Oxbow as part of my talk on composability at the #SciPy2025 Conference on Wednesday! Hope to see some of you there.

cfp.scipy.org/scipy2025/ta...

Breaking the silo: composable bioinformatics through cross-disciplinary open standards SciPy 2025

The practice of data science in genomics and computational biology is fraught with friction. This is in large part because bioinformatic tools tend to be tightly coupled to file input/output. As a res...

Nezar Abdennur @nvictus.bsky.social · Jul 7

It also supports:

* Column projection and pushdown (parsing only the fields you need)
* Complex and nested field types (e.g. alignment tags, variant genotype call data, etc.)
* Genomic range-based queries via an index
* User-defined transports and file systems

Nezar Abdennur @nvictus.bsky.social · Jul 7

This update (v0.4.x) provides complete #ApacheArrow data models for 11 file formats and counting, including the GA4GH/htslib formats and UCSC’s BigWig/BigBed.

Nezar Abdennur @nvictus.bsky.social · Jul 7

We revamped the #rustlang backend and implemented a new "DataSource" API in #Python, which allows for streaming conventional #genomic files – in-memory, on-disk, or in the cloud – into the modern data tools you use regularly, including #Pandas, #Polars, #DuckDB, and #Dask.

Nezar Abdennur @nvictus.bsky.social · Jul 7

I'm proud to announce the latest release of 🧬 #Oxbow 🏹, with new features to make NGS data analysis more powerful, efficient, and "composable".

Learn more at: oxbow.readthedocs.io
