Author | Lightnews

Christian Düben @cdueben.bsky.social · 14d

It is generally a good idea to use TeX Live via a dev container because this enforces the same setup for all coauthors. The container is compatible with, but not part of, the SoDa Replicator itself because it is not technically required and because it demands prior knowledge of WSL, Docker, etc. 2/2

Christian Düben @cdueben.bsky.social · 14d

Since publishing the Monash SoDa Replicator, users have told me about university-provided Windows laptops that do not allow a local TeX Live installation. Hence, I have created a dev container that circumvents this problem: github.com/cdueben/tex_.... #tex #texlive #devcontainer #vscode 1/2

GitHub - cdueben/tex_live_dev_container: Dev Container for TeX Live.


github.com


Christian Düben @cdueben.bsky.social · 27d

We have published a video #tutorial on the Monash SoDa Replicator, a template for #reproducible #academic #research: youtube.com/playlist?lis.... #git #github #tex #ai #replication #productivity #collaboration

SoDa Replicator - YouTube

Tutorial on the Monash SoDa Replicator (https://github.com/sodalabsio/soda_replicator), a template for reproducible academic research. Published by Monash Un...

youtube.com


Christian Düben @cdueben.bsky.social · Sep 9

Hence, the package itself is very quick and simple to install from source. It is easy to use and comes with a number of examples.


Christian Düben @cdueben.bsky.social · Sep 9

David Kreitmeir and I published the synthetic matching method from Kreitmeir et al. (2025) as an #rstats package on CRAN (cran.r-project.org/package=synt...). Whereas my other recent R packages are primarily wrappers of my C++ code, synthReturn is a full R package (Fortran only in dependencies).

synthReturn: Synthetic Matching Method for Returns

Implements the revised Synthetic Matching Algorithm of Kreitmeir, Lane, and Raschky (2025) <doi:10.2139/ssrn.3751162>, building...

CRAN.R-project.org


Christian Düben @cdueben.bsky.social · Sep 5

My cppcontainers #rstats package (cran.r-project.org/package=cppc...) is back on CRAN. The functionality remains unchanged. It had just been archived on CRAN a few months ago because of NA placeholder values that threw compiler warnings in exotic builds on macOS. #cpp #datascience

cppcontainers: 'C++' Standard Template Library Containers

Use 'C++' Standard Template Library containers interactively in R. Includes sets, unordered sets, multisets, unordered multisets, maps, unordered maps, multimaps, unordered multimaps, stacks, queues, ...

cran.r-project.org


Christian Düben @cdueben.bsky.social · Aug 9

Enjoying my first time at @defcon.bsky.social. Amazing talks, demos, etc. so far. And I met famous journalist @jackrhysider.bsky.social whose inspiring work got me interested in #infosec in the first place. I am gathering a lot of insights for my own research.


Christian Düben @cdueben.bsky.social · Jul 7

#AI is even better than I am at adding seg faults to my code 😅. Maybe vibe coding a Fibonacci heap into my spatial analysis was not the best idea. #cpp


Christian Düben @cdueben.bsky.social · Jul 2

I tried #Positron today. Forking and customizing VS Code instead of further pushing RStudio was a good choice. RStudio is not bad, but Positron seems nicer (and better for #rstats development than VS Code itself). Good job @posit.co👍.


Christian Düben @cdueben.bsky.social · Jun 4

Yes, I agree, Python's package management is a terrible user experience. CRAN feels ok to users, but is a bad package developer experience 😅.

Christian Düben @cdueben.bsky.social · Jun 4

That depends on the field. Many people in my environment (economists) have migrated to Python.
Modern R and Python packages are just C and C++ wrappers anyway. Native R and Python are not performant enough for modern tasks.


Christian Düben @cdueben.bsky.social · Jun 4

Statistical programs in Python tend to build on numpy and pandas. Given how mature and stable these packages are, I would not consider these dependencies a major drawback of Python nowadays.
Base R also only has dense matrices. More elaborate matrix types require a package.
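
To make the dense-versus-sparse distinction concrete, here is a minimal sketch in plain Python. It stores only the nonzero entries of a matrix in a dictionary keyed by (row, column) and multiplies it by a vector. The helper names `to_dok` and `dok_matvec` are hypothetical and chosen for illustration; this is not how any particular R or Python package implements sparse matrices.

```python
# Hypothetical helpers for illustration; not taken from any R or Python package.
def to_dok(dense):
    """Convert a dense list-of-lists matrix to a {(row, col): value} dict of nonzeros."""
    return {
        (i, j): v
        for i, row in enumerate(dense)
        for j, v in enumerate(row)
        if v != 0
    }


def dok_matvec(dok, n_rows, x):
    """Multiply a dictionary-of-keys matrix by a vector x, touching only nonzeros."""
    y = [0] * n_rows
    for (i, j), v in dok.items():
        y[i] += v * x[j]
    return y


dense = [
    [4, 0, 0],
    [0, 0, 5],
    [0, 0, 0],
]
sparse = to_dok(dense)
print(len(sparse))                       # 2 stored entries instead of 9
print(dok_matvec(sparse, 3, [1, 2, 3]))  # [4, 15, 0]
```

The dense layout keeps all nine cells, while the dictionary keeps two, which is the kind of saving that matters once matrices are large and mostly zero.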


Christian Düben @cdueben.bsky.social · Jun 4

Fair point 😄.


Christian Düben @cdueben.bsky.social · Jun 4

Unfortunately, Julia never gained enough traction and never developed a reasonably sophisticated package ecosystem. So, users will stick to the mediocre choices of #rstats and Python. 6/n


Christian Düben @cdueben.bsky.social · Jun 4

According to my own observation, there is a considerable shift from #rstats to Python among empirical researchers. A key driver of this is the Python-first nature of machine learning APIs. I think that #rstats will lose market share to Python, but it will not die soon. 5/n


Christian Düben @cdueben.bsky.social · Jun 4

For now, the simplicity of #rstats makes it a better choice than Python for many users who do not need the versatility of a general purpose programming language. It keeps attracting users from outdated commercial software like Stata or SPSS. 4/n


Christian Düben @cdueben.bsky.social · Jun 4

The poor management of CRAN does not help either. Both R Core and CRAN need to fundamentally change for #rstats to remain popular. 3/n


Christian Düben @cdueben.bsky.social · Jun 4

Compare the release notes of base R and #Python over the course of the past years. Python is leaping forward while base R is devoid of innovation. #rstats is kept alive and thriving by its packages. Unfortunately, the severe limitations of base R also limit the scope of package development. 2/n


Christian Düben @cdueben.bsky.social · Jun 4

There has been a lot of childish discussion about the future of #rstats on this platform recently. #rstats is my main language and I will stick to it for a while. However, posts calling it perfect and denouncing any critique as invalid are ridiculous. #rstats has flaws that can threaten its future. 1/n


Christian Düben @cdueben.bsky.social · Apr 29

I need to put together a deep learning model and I do not know whether I should go with #pytorch or #tensorflow + #keras. I have a little experience in the latter and none in the former. What would you recommend for someone who does not care about how pythonic a tool is?


Christian Düben @cdueben.bsky.social · Apr 29

I just had to install the most intrusive #software of my life. It automatically starts at boot, but does not show up in the OS's auto startup list. It does not have an exit button or settings. I have to shut down the background process in the process viewer/task manager each time. 🫣


Christian Düben @cdueben.bsky.social · Apr 5

The new version of my spaths #rstats package (cran.r-project.org/package=spaths) removes the edge limit (previously 2^31 - 1). Do you need to handle 10 billion edges with custom transition functions? No problem, spaths can do that. #graph #theory

spaths: Shortest Paths Between Points in Grids

Shortest paths between points in grids. Optional barriers and custom transition functions. Applications regarding planet Earth, as well as generally spheres and planes. Optimized for computational per...

cran.r-project.org
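
For context on the old limit: 2^31 - 1 is the largest value a signed 32-bit integer can hold, which is also the range of R's integer type. The sketch below is plain Python arithmetic, not spaths code. It checks that 10 billion edges exceed that range but fit in a signed 64-bit index. Whether the previous limit actually came from 32-bit indexing inside spaths is an assumption here, since the post only states the number. The helper `fits` is hypothetical.

```python
# Illustrative arithmetic only; this is not code from the spaths package.
INT32_MAX = 2**31 - 1   # previous edge limit quoted in the post
INT64_MAX = 2**63 - 1   # largest signed 64-bit integer


def fits(n_edges: int, max_index: int) -> bool:
    """Return True if an edge count can be represented up to max_index."""
    return 0 <= n_edges <= max_index


n_edges = 10_000_000_000  # the 10 billion edges mentioned in the post

print(INT32_MAX)                 # 2147483647
print(fits(n_edges, INT32_MAX))  # False
print(fits(n_edges, INT64_MAX))  # True
```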


Christian Düben @cdueben.bsky.social · Apr 2

😅 Considering R Core's absence of innovation in base R development, a reimplementation in Rust would certainly be a surprise.


Christian Düben @cdueben.bsky.social · Mar 19

Over the course of the past years, I have received a number of emails asking why conleyreg (my first #rstats package) and fixest produce different results. Contributors to an issue in fixest's GitHub repo have now shown that both are economically correct. The assumptions just differ.


Christian Düben @cdueben.bsky.social · Mar 18

We already got complaints from a data editor when code using pre-processed data took a few hours to run on his laptop. Code on raw data does not run at all or takes weeks on end user devices.
