Daniel Falbel
dfalbel.bsky.social
With a nice latent space vis ;)
luz v0.5.1 is now on #rstats CRAN. Just a small bug fix related to forwarding `predict` parameters to the model. I also added a new Variational Autoencoder example to our examples gallery: mlverse.github.io/luz/articles...
Examples
mlverse.github.io
It has many limitations:

- no dynamic shapes
- hardcoded weights

But it should be possible to greatly improve this in the future.
Where `resnet18_stablehlo.mlir` is a model exported with:
Here's a small example:
The motivation for it comes from working on integrating
Apple's Embedding Atlas into ragnar: github.com/tidyverse/ra...

We ended up using the python package, but we could have used a #rstats / JS only solution if we had some nice mosaic integration.
Embedding Atlas by dfalbel · Pull Request #124 · tidyverse/ragnar
Add support for visualizing the store using embedding-atlas
github.com
Just made a nice website for mosaicr, a small #rstats htmlwidget package that lets you use the
@idl.uw.edu Mosaic data visualization framework from R.

The real power comes when embedding Mosaic plots within Shiny apps. See the examples and docs at:

dfalbel.github.io/mosaicr/
Scalable, Interactive Data Visualization
Produce scalable, interactive data visualization using the Mosaic framework.
dfalbel.github.io
Yep! We have some work in github.com/r-xla
It's a long-term project though. Theoretically it's already possible to create a graph in JAX, export it to StableHLO and execute it in R with no Python dependency.
r-xla
r-xla has 6 repositories available. Follow their code on GitHub.
github.com
tok is back on CRAN! tok is @hf.co tokenizers for #rstats. It uses the same Rust library as the Python interface. It's pretty fast and fully compatible with the tokenizers.json files available on the Hub.

See the release notes:

github.com/mlverse/tok/...
Release v0.2.0 · mlverse/tok
Updated upstream tokenizers to 0.20.3 Update extendr-api to 0.8.1
github.com
Reposted by Daniel Falbel
Dev 📦 alert! {lang} translates R help on-the-fly using your local LLM! It also overrides the `?` so you can easily access the translated docs and have them displayed on your IDE's help pane github.com/mlverse/lang #rstats #llm #ollama
GitHub - mlverse/lang: Uses LLMs to translate R help docs on the fly
Uses LLMs to translate R help docs on the fly. Contribute to mlverse/lang development by creating an account on GitHub.
github.com
Reposted by Daniel Falbel
Excited and grateful that R-Universe is R Consortium's newest top-level project! This means sustained support for @rOpenSci.hachyderm.io.ap.brid.gy's platform for discovery and publishing of #rstats packages. Hats off to @jeroenooms.bsky.social for his leadership!

ropensci.org/blog/2024/12...
R-Universe Named R Consortium Top-Level Project
We're excited to announce R-Universe has been named the R-Consortium's newest Top-Level Project.
ropensci.org
Happy to help make it work on MPS. It should just work if you create a tensor on the MPS device, e.g.:

torch_randn(10,10, device="mps")
torch for #rstats has reached 500 GitHub stars :)
Reposted by Daniel Falbel
📦 usethis 3.1.0 📦 is released. `use_vignette()` and `use_article()` can now help you initiate a Quarto (.qmd) vignette or article. #rstats

usethis.r-lib.org/news/index.h...
Changelog
usethis.r-lib.org
This only happens on Linux when using the CPU. But many workflows involve transforming data on the CPU before executing the model on the GPU, so this fix should improve the performance of most torch programs running on Linux!
But LibTorch is really optimized to work with Intel MKL, and indeed statically links to a version of MKL for faster math kernels. We were not using it, leaving a lot of performance on the table.
The problem was that previous versions of torch would use whatever BLAS library R is configured to use (which on most systems is a single-threaded BLAS or, if you had some fun configuring R, possibly OpenBLAS).
Nice! Wrapping the bucket with a CDN, such as CloudFront, should also make downloads faster, thanks to regional caching and other optimizations that CDNs do. It also lets you decouple the storage location from how things are distributed, so it gives you freedom to change storage structures later.
I know nothing about nix, but how do folks download from the cache later? In general, using a CloudFront distribution instead of accessing the bucket directly will significantly reduce egress costs.
After a long investigation, I still don't understand what is really happening and what's wrong, but... The fix turned out to be quite simple. I just had to add the RTLD_DEEPBIND flag when opening the dynamic library 🤷‍♂️

github.com/mlverse/torc...
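
For readers who haven't met RTLD_DEEPBIND: it is a glibc-specific dlopen flag that makes a library resolve symbols against its own definitions and dependencies before the global scope. Below is a minimal sketch of passing it when opening a dynamic library, written with Python's ctypes purely for illustration. It is not the code from the torch fix. The helper name `open_with_deepbind` is hypothetical, and the C math library is only a stand-in for the library actually involved.

```python
import ctypes
import ctypes.util
import os

# RTLD_DEEPBIND is a glibc extension. Where Python does not expose it
# (e.g. macOS or musl), fall back to 0 so no extra flag is passed.
DEEPBIND = getattr(os, "RTLD_DEEPBIND", 0)


def open_with_deepbind(path):
    """Hypothetical helper: dlopen `path` with RTLD_DEEPBIND when available."""
    flags = os.RTLD_NOW | os.RTLD_LOCAL | DEEPBIND
    return ctypes.CDLL(path, mode=flags)


# Stand-in library for the demo: the C math library.
libm_name = ctypes.util.find_library("m") or "libm.so.6"
libm = open_with_deepbind(libm_name)

# Declare the signature of cos() so ctypes converts doubles correctly.
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

result = libm.cos(0.0)
print(result)
```

The flag only changes behaviour when the loaded library and the host process define clashing symbols, so this toy example behaves the same with or without it.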