Python4DataScience
@python4data.science
Teaching materials for the cusy training courses on a Python-based data science workflow: https://cusy.io/en/seminars
We have updated our tutorial on data management with DVC. It also allows you to create lightweight data science and data modelling workflows and execute them in a parameterised manner: www.python4data.science/en/latest/pr...
#Data #Versioncontrol #Git #DataScience #Modeling #Python
October 21, 2025 at 12:02 PM
We have now described how to create a configuration for Claude Code so that it uses uv reliably: python4data.science/en/latest/pr...
#ClaudeCode #Python #Packaging #uv
Configuring Claude Code for uv
How do we configure Claude Code to automatically use uv instead of pip for Python package management? Claude Code uses CLAUDE.md files to configure your project’s storage and context, ensuring a co...
python4data.science
September 24, 2025 at 12:52 PM
Since we have recently been asked frequently whether pandas is slow and whether we should use Polars, Dask or DuckDB instead, we have now provided an initial overview of the various technologies: www.python4data.science/en/latest/wo...
#Python #Performance #DuckDB
pandas
pandas is a Python library for data analysis that has become very popular in recent years. On the website, pandas is described thus: “pandas is a fast, powerful, flexible and easy to use open sourc...
www.python4data.science
September 23, 2025 at 12:53 PM
Reposted by Python4DataScience
We have finally documented Ruff – the tool greatly simplifies static code analysis for Python projects: www.python4data.science/en/latest/pr...
#Python #Ruff
Ruff
Ruff is an extremely fast Python linter and code formatter written in Rust that can enforce the rules of flake8, isort, perflint, Black, Bandit, and others. In total, Ruff can check over 800 rules....
www.python4data.science
August 25, 2025 at 2:30 PM
Reposted by Python4DataScience
💥Spack v1.0 is out!💥

This is a huge milestone. We reworked the core to add compiler dependencies, and we're introducing a stable package API.

🚀1.0 also adds concurrent builds, better includes, and much more -- read it all in the release notes!

github.com/spack/spack/...
github.com
July 20, 2025 at 10:45 AM
The XKCD comic on reproducible scientific results fits perfectly with our tutorial 🧐 😉
www.python4data.science/en/latest/pr...
July 19, 2025 at 12:12 PM
Almost more significant than the success of #Python is the growth of #Jupyter #Notebooks: “Data scientists and machine learning researchers commonly use the #OpenSource application for #MachineLearning, #DataViz, and more.”
jupyter-tutorial.readthedocs.io/en/latest/in...
July 15, 2025 at 7:53 AM
We have added a section on Protomaps to our PyViz tutorial. Protomaps makes map visualisations so much easier.
pyviz-tutorial.readthedocs.io/en/latest/pr...
#Protomaps #Geography #World #Map @protomaps.com
Protomaps
Protomaps is an open source project for the creation and use of vector maps. It was developed as a lightweight alternative to conventional map providers and offers a number of advantages. Open Sour...
pyviz-tutorial.readthedocs.io
May 21, 2025 at 5:34 AM
We have expanded the section on geodata to include the most common (tile) file formats: www.python4data.science/en/latest/da...
#Geography #GIS
Geodata
File formats: PMTiles: PMTiles is a general format for tile data addressed by Z/X/Y coordinates. This can be cartographic vector tiles, remote sensing data, JPEG images or similar. HTTP Range Reque...
www.python4data.science
May 15, 2025 at 12:32 PM
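To illustrate what addressing tiles by Z/X/Y coordinates means, here is a minimal sketch that converts a longitude/latitude pair into tile indices. It assumes the common Web Mercator tiling scheme with the origin in the top-left corner; the function name lonlat_to_tile is hypothetical and not taken from the tutorial or from any PMTiles library.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple[int, int, int]:
    """Return the (z, x, y) tile that contains a WGS 84 longitude/latitude.

    Assumption: Web Mercator tiling, x grows eastwards, y grows southwards.
    """
    n = 2**zoom  # number of tiles along each axis at this zoom level
    lat_rad = math.radians(lat)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the outer edges still map to a valid tile.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return zoom, x, y


# Example call with coordinates roughly in the centre of Berlin.
print(lonlat_to_tile(13.405, 52.52, 10))
```

Each additional zoom level doubles the number of tiles per axis, so the same point falls into ever smaller tiles as Z increases.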
We have updated our Python Basics tutorial to describe the guidelines for docstrings in more detail:
python-basics-tutorial.readthedocs.io/en/latest/do...
#Python #Documentation #DX
Docstrings
With the Sphinx extension sphinx.ext.autodoc, docstrings can also be included in the documentation. The following directives can be specified … for function-like objects: … for data and attributes:...
python-basics-tutorial.readthedocs.io
April 7, 2025 at 6:23 AM
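As a small illustration of the kind of docstring that sphinx.ext.autodoc can pull into the documentation, here is a sketch of a function documented with reStructuredText field lists. The function mean and its wording are hypothetical examples, not taken from the tutorial; an autodoc directive such as autofunction would render the same text that inspect.getdoc returns here.

```python
import inspect


def mean(values):
    """Return the arithmetic mean of a sequence of numbers.

    :param values: Non-empty sequence of numbers.
    :returns: The mean as a float.
    :raises ValueError: If ``values`` is empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


# inspect.getdoc returns the docstring with its indentation cleaned up.
print(inspect.getdoc(mean))
```

Keeping the first line a short, complete sentence pays off because many tools show only that summary line.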
We have expanded the section on open source hardware licences to include the TAPR and Solderpad Hardware Licence: www.python4data.science/en/latest/pr...
#OpenSource #Hardware #Licence
Licensing
In order for others to use your software, it should have one or more licences that describe the terms of use. Otherwise, it is likely to be protected by copyright. Authors are those who have origin...
www.python4data.science
April 3, 2025 at 8:19 AM
The Thoughtworks Technology Radar has now also adopted the tools uv and Renovate: www.thoughtworks.com/radar
Technology Radar | Guide to technology landscape
The Technology Radar is an opinionated guide to today's technology landscape. Read the latest here.
www.thoughtworks.com
April 2, 2025 at 3:21 PM
We have expanded our section on GitLab CI/CD pipelines with examples of
• GitLab Pages
• npm deployments with rsync
• building Docker containers
• multi-arch images with Buildah
• migrating GitHub Actions
www.python4data.science/en/latest/pr...
#GitLab #CICD #DevOps #DX
GitLab CI/CD
GitLab CI/CD can automatically build, test, deploy and monitor your applications during iterative code changes. This reduces the risk that you will develop new code based on buggy previous versions...
www.python4data.science
March 28, 2025 at 7:00 AM
Reposted by Python4DataScience
We have written down our experiences of how LLMs help us with programming: cusy.io/en/blog/how-...
#LLM #AI #programming #DX #Python
How LLMs help us with programming
We were recently asked by a global chemical company if we could give their engineers an introduction to programming with Python and Large Language Models (LLM). Their expectations of what they wanted ...
cusy.io
March 17, 2025 at 6:34 AM
Which Python dashboard library for which purpose?
We were left with only two candidates: Voilà and Panel: jupyter-tutorial.readthedocs.io/en/latest/da...
#DataViz #Python
Voilà vs. Panel
A major difference between Panel and Voilà lies in the processing of the notebooks: Voilà is based directly on the notebook format and transfers the entire output to the Voilà dashboard, whereas in...
jupyter-tutorial.readthedocs.io
March 8, 2025 at 5:40 PM
🎉 4000 Pythonistas and data scientists now follow us on Bluesky 🤗 We are very pleased about the great interest in what we offer.
#Python #DataScience
February 28, 2025 at 6:04 AM
Our course on the versioned and reproducible storage of code and data in data science workflows is now also referenced in the official Git documentation: git-scm.com/doc/ext
#Git #DataScience #DX
Git - External Links
git-scm.com
February 17, 2025 at 11:24 AM
git stash can make working much easier. We have described some options and configurations that we use: www.python4data.science/en/latest/pr... #git #dx
Working with Git
Start working on a project: Start your own project: $ git init [PROJECT] creates a new, local Git repository. [PROJECT]: if the project name is given, Git creates a new directory and initializes ...
www.python4data.science
February 14, 2025 at 6:37 PM
🎉 We are now on the ‘Awesome Inclusion Open Science list’ 🤗: github.com/willingc/awe...
Many thanks to @willingc.bsky.social for creating the list.
#Inclusion #OpenScience #OpenData #OpenSource
GitHub - willingc/awesome-inclusion-open-science: Curated resources for inclusive collaboration in open science.
Curated resources for inclusive collaboration in open science. - willingc/awesome-inclusion-open-science
github.com
February 7, 2025 at 10:48 AM
We have expanded our #Git section:
• Add diff source and destination prefix
• Add default branch config for init
• Add git-symbolic-ref
• Add Git Credential Store for Linux
• Update shallow clones
• Add shell config
• Add shell config and command line tools
www.python4data.science/en/latest/pr...
Manage code with Git
To gain better control over your source code, it is usually managed with Git. Git is a mature and very actively maintained open source project originally developed in 2005 by Linus Torvalds, the in...
www.python4data.science
January 30, 2025 at 4:29 PM