Mark Fairbanks
markfairbanks.bsky.social
Data Scientist/Engineer
Author of R’s tidytable
Co-author of R’s dtplyr
Reposted by Mark Fairbanks
I’ve been drafting a blog post on my recent Python tooling woes and the working title is:

“Python tooling is easy! Just use pip, conda, mamba, poetry, pipx, pdm, nix, uv, venv, virtualenv, pipenv or pyenv”
February 14, 2025 at 11:41 PM
Reposted by Mark Fairbanks
We are happy to share more about what we are building and our goal to run Polars on any dataset size!

A managed Distributed Polars compute cluster to ensure a single DataFrame API for all your needs.

pola.rs/posts/polars...
Polars Cloud; the distributed Cloud Architecture to run Polars anywhere
DataFrames for the new era
pola.rs
February 12, 2025 at 2:11 PM
Reposted by Mark Fairbanks
I tried converting some of my S3 code to S7.

What I loved:
• the automatic validation is phenomenal
• extending S3 with S7 is straightforward

What I’m struggling with:
• my custom constructors, particularly with child classes, are pretty complicated and involve lots of copy-pasting of code
November 29, 2024 at 11:20 PM
Reposted by Mark Fairbanks
A day late but…

extremely thankful for folks making awesome Python libraries so I can avoid learning Rust or C.
November 29, 2024 at 10:04 PM
Reposted by Mark Fairbanks
If you're interested in trying out LLMs in #rstats but don't know where to begin, I've added two vignettes to elmer: elmer.tidyverse.org/articles/elm... and elmer.tidyverse.org/articles/pro...
Getting started with elmer
elmer.tidyverse.org
November 29, 2024 at 3:45 PM
Reposted by Mark Fairbanks
You want to join two tables on their ID column, but only when the dates in one table fall within the range of the other table.

Polars lets you do that with `join_where`, which supports inequality joins through the use of inequality predicates.

Here's an example 👇
November 28, 2024 at 10:52 AM
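The example attached to the post above did not survive in this copy. As a rough illustration of what such a range-conditioned join computes, here is a minimal plain-Python sketch. It does not use Polars and is not how `join_where` is implemented; the tables, column names, and the `range_join` helper are all hypothetical.

```python
from datetime import date

# Hypothetical sample data: dated events, and one validity window per ID.
events = [
    {"id": 1, "event_date": date(2024, 1, 15)},
    {"id": 1, "event_date": date(2024, 3, 1)},
    {"id": 2, "event_date": date(2024, 2, 10)},
]
windows = [
    {"window_id": 1, "start": date(2024, 1, 1), "end": date(2024, 1, 31)},
    {"window_id": 2, "start": date(2024, 2, 1), "end": date(2024, 2, 28)},
]


def range_join(left, right):
    """Pair rows whose IDs match and whose event_date lies in [start, end]."""
    return [
        {**lrow, **rrow}
        for lrow in left
        for rrow in right
        # Equality on the ID plus two inequality conditions on the dates.
        if lrow["id"] == rrow["window_id"]
        and rrow["start"] <= lrow["event_date"] <= rrow["end"]
    ]


matched = range_join(events, windows)
for row in matched:
    print(row["id"], row["event_date"], row["start"], row["end"])
```

In Polars, conditions like these would be written as expressions (for instance a comparison such as `pl.col("event_date") >= pl.col("start")`) and passed to `join_where`; the nested loop here only shows which row pairs qualify, not how a real engine would find them efficiently.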
Reposted by Mark Fairbanks
@rabaath.bsky.social explains why even after 5 years #python pandas feels clunky when coming from #Rstats: www.sumsar.net/blog/pandas-... My take is not on whether one is better than the other for the experienced, but on which one is more accessible for the inexperienced.
Why pandas feels clunky when coming from R
Five years ago I started a new role and I suddenly found myself, a staunch R fan, having to code in Python on a daily basis. Working with data, most of my Python work involved using pandas, the …
www.sumsar.net
November 19, 2024 at 10:25 AM
Reposted by Mark Fairbanks
Excited to join the Bluesky community! We're here to connect with everyone who uses Quarto to share their ideas. What have you been creating lately?

#DataScience #OpenScience #ReproducibleResearch #rstats #pydata #julialang
November 15, 2024 at 11:21 PM
Reposted by Mark Fairbanks
We are excited to announce that plotnine v0.14.0 is now out!

Plotnine brings the grammar of graphics to #Python. In addition to an amazing new hex logo 🛸, this latest release introduces a host of enhancements and features.

Check them out in the blog post: plotnine.org/blog/2024/11...
November 18, 2024 at 3:14 PM
Reposted by Mark Fairbanks
Making the move, and I must say… it’s quite refreshing!
November 18, 2024 at 1:53 AM
Reposted by Mark Fairbanks
seems a good place to plug Mark Fairbanks' tidytable, a go-to when I want to code tidy-style and use data.table 🤘
markfairbanks.github.io/tidytable/
Tidy Interface to data.table
A tidy interface to data.table, giving users the speed of data.table while using tidyverse-like syntax.
markfairbanks.github.io
November 15, 2024 at 6:57 PM
Reposted by Mark Fairbanks
You can use tidytable for the best of both worlds.

markfairbanks.github.io/tidytable/
Tidy Interface to data.table
A tidy interface to data.table, giving users the speed of data.table while using tidyverse-like syntax.
markfairbanks.github.io
November 14, 2024 at 7:06 PM
Reposted by Mark Fairbanks
We have our first new data.table feature release in years! 🎉🎉🎉

🔗 Blog post about the process and the new release: https://bit.ly/3Owg0c6

Huge thanks to all the hard working open-source devs who came together to make this happen. 💛

#rstats #datascience #rdatatable
February 5, 2024 at 9:03 PM