Martí Bosch
@martibosch.bsky.social
Doctor in Civil and Environmental Engineering, EPFL - Urban climate, Python, and a bit of remote sensing, landscape ecology and complexity - martibosch.github.io
PS: many thanks to @capetorch.bsky.social and especially @charles-irl.bsky.social for helping me get started with @modal-labs.bsky.social 🙏
June 17, 2025 at 2:17 PM
Such a notebook can be executed within a local environment. Using simple Python functions, training/fine-tuning and inference are run on a (GPU-enabled) Modal ephemeral app.

Then you automatically get the results as local variables in your notebook, and continue your pipeline 🚀
June 17, 2025 at 2:17 PM
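A minimal sketch of the local/remote split described above, under loud assumptions: `ThreadPoolExecutor` is only a local stand-in for the remote GPU worker (it is not Modal's API), and `preprocess`, `train_and_predict` and `evaluate` are hypothetical placeholders for the real pipeline steps.

```python
from concurrent.futures import ThreadPoolExecutor


def preprocess(raw):
    """Cheap local step: drop missing values (hypothetical placeholder)."""
    return [x for x in raw if x is not None]


def train_and_predict(samples):
    """Placeholder for the expensive step that would run on a GPU worker."""
    mean = sum(samples) / len(samples)
    return [x - mean for x in samples]


def evaluate(predictions):
    """Cheap local step: summarize the results (hypothetical placeholder)."""
    return max(abs(p) for p in predictions)


def run_pipeline(raw):
    samples = preprocess(raw)  # runs locally
    # Only this step is handed to the (stand-in) remote worker.
    with ThreadPoolExecutor(max_workers=1) as remote:
        predictions = remote.submit(train_and_predict, samples).result()
    # The result is an ordinary local variable again.
    return evaluate(predictions)
```

In the actual setup, the stand-in would presumably be replaced by a function registered on a GPU-enabled Modal app and called remotely; the example notebook linked in this thread has the real code.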
I illustrate this point with an example notebook using the TreeAI Database (CC: @mirelabs.bsky.social) to fine-tune the pre-trained DeepForest tree crown detection model and then train a species classification model for each tree crown 👇

t.co/1e1LgnuNIi
https://deepforest-modal-app.readthedocs.io/en/latest/treeai-example.html
June 17, 2025 at 2:17 PM
The idea is quite simple: a full tree detection pipeline consists not only of training and inference but also many steps to preprocess the data and postprocess the results, e.g., model evaluation, plots...

Do we need a GPU server all along? No! Just for training and inference 👇
June 17, 2025 at 2:17 PM
Reposted by Martí Bosch
The publisher has cut their costs by outsourcing to this company, the company has cut their costs by using AI/low-paid staff instead of paying for a proper job, while I’ve spent hours & hours fixing the manuscript, so all the extra labour from cost-cutting has fallen on me, the unremunerated author
May 27, 2025 at 10:24 AM
TL;DR: besides great global standardized datasets such as GHCNh, there are often many other sources of meteorological data. The central objective of meteora is to provide a standardized API for meteorological station data in Python, making it easy to assemble multi-source datasets, e.g., for Barcelona:
April 10, 2025 at 2:12 PM
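To make "assembling a multi-source dataset" concrete, here is a rough sketch that merges station lists from several sources and skips stations that appear in more than one of them (as with the Meteocat stations also featured in GHCNh). Everything here is an assumption for illustration: `merge_stations`, the tuple layout and the 0.1 km tolerance are hypothetical and not part of meteora's API.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * asin(sqrt(a))


def merge_stations(sources, tol_km=0.1):
    """Combine per-source station lists, skipping near-duplicate locations.

    `sources` maps a source name to a list of (station_id, lon, lat) tuples.
    Earlier entries win when two stations lie within `tol_km` of each other.
    """
    merged = []
    for source, stations in sources.items():
        for station_id, lon, lat in stations:
            is_duplicate = any(
                haversine_km(lon, lat, m["lon"], m["lat"]) <= tol_km
                for m in merged
            )
            if not is_duplicate:
                merged.append(
                    {"source": source, "id": station_id, "lon": lon, "lat": lat}
                )
    return merged
```

Matching purely on distance is a simplification; a real workflow would probably also compare station identifiers or names before treating two entries as the same station.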
But let me ask one last time, can we get more stations? Again, the answer is yes - enter citizen weather stations (CWS). Meteora features the `NetatmoClient` to access public data from Netatmo weather stations.

The spatial availability of CWS in urban areas can be a game changer. Here is Barcelona:
April 10, 2025 at 2:04 PM
Again, can we find more stations? Yes, we can get data from the Meteorological Service of Catalonia (Meteocat) CC: @acam-cat.bsky.social

In fact, many of the Meteocat stations are featured in the GHCNh. But not all of them, i.e., there are 242 Meteocat stations vs. 93 GHCNh stations:
April 10, 2025 at 1:56 PM
But are these all the stations we can find? Obviously not. We can also use the `AEMETClient` to get data from @aemet.es.

Here we can see how we can improve the spatial density of stations by combining both sources:
April 10, 2025 at 1:56 PM
Imagine you want to get meteorological observations for any region of the world. A good starting point is always the Global Historical Climatology Network hourly (GHCNh) dataset by @noaa.gov, which can be accessed in meteora via the `GHCNHourlyClient`.

These are the GHCNh stations in Catalonia:
April 10, 2025 at 1:56 PM
Another key feature is local request/file caching, which not only improves performance (e.g., in local notebooks) but is especially helpful with API-limited providers.

I will be adding further time-series based QC methods shortly. Stay tuned for more updates 📻
March 27, 2025 at 3:46 PM
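For readers wondering why caching helps with API-limited providers, a minimal file-based sketch follows. It is not meteora's actual caching mechanism: `cached_fetch`, its arguments and the on-disk JSON layout are all hypothetical, and `fetch` stands for whatever function really contacts the provider.

```python
import hashlib
import json
from pathlib import Path


def cached_fetch(url, params, fetch, cache_dir):
    """Return the response for (url, params), calling `fetch` only on a miss."""
    # Identical requests map to the same cache file name.
    key = hashlib.sha256(
        json.dumps([url, params], sort_keys=True).encode("utf-8")
    ).hexdigest()
    cache_file = Path(cache_dir) / f"{key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    response = fetch(url, params)  # the only place the provider is contacted
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(response), encoding="utf-8")
    return response
```

Re-running a notebook cell with the same request then reads from disk instead of spending another call against the provider's quota.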
Additionally, meteora features preliminary support for vector data cubes (using xvec), which are likely the most natural data structure for meteorological station data and allow writing to/reading from interoperable high-performance formats such as @zarr.dev

meteora.readthedocs.io/en/latest/us...
Data structures for geospatial time series data — Meteora 0.4.0 documentation
March 27, 2025 at 3:37 PM
Many providers are supported (see the list at meteora.readthedocs.io/en/latest/su...), from global ones (e.g., GHCNh) to regional ones (e.g., MetOffice) and citizen weather stations (CWS) from Netatmo.
Additionally, there is a module to quality-control CWS data:
meteora.readthedocs.io/en/latest/us...
Citizen weather stations quality checks — Meteora 0.4.1 documentation
March 27, 2025 at 3:37 PM
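As a rough idea of what quality-controlling CWS data can involve, the sketch below flags stations whose reading at a given timestamp is far from the other stations. It is a generic median/MAD outlier check, not necessarily what meteora's QC module implements; `flag_outliers` and its threshold are hypothetical.

```python
from statistics import median


def flag_outliers(readings, threshold=3.0):
    """Return the ids of stations whose reading is a robust outlier.

    `readings` maps station id -> temperature at a single timestamp.
    A station is flagged when its distance from the median exceeds
    `threshold` times the median absolute deviation (MAD).
    """
    values = list(readings.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return set()  # no spread to compare against, so flag nothing
    return {
        sid for sid, v in readings.items() if abs(v - center) > threshold * mad
    }
```

With made-up readings of 19.5, 20.0, 20.5 and 21.0 °C plus one station at 35.0 °C, only the last one is flagged.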
Meteora is essentially a collection of "clients" that request meteorological observation data from different providers and process the responses into a standardized format. See the "overview" notebook for an example with METAR/ASOS data: meteora.readthedocs.io/en/latest/us...
Overview — Meteora 0.4.0 documentation
March 27, 2025 at 3:37 PM
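The "client" idea above can be sketched in a few lines. This is an illustration of the pattern only, not meteora's code: `StationClient`, both provider classes, their payload layouts and the record tuple are made up for the example.

```python
from abc import ABC, abstractmethod


class StationClient(ABC):
    """Hypothetical base class: fetch a raw payload, return standard records."""

    @abstractmethod
    def fetch_raw(self):
        """Return the provider-specific payload."""

    @abstractmethod
    def to_records(self, raw):
        """Convert the payload to (station_id, time, variable, value) tuples."""

    def get_records(self):
        # Shared entry point: every client returns the same record layout.
        return sorted(self.to_records(self.fetch_raw()))


class ProviderAClient(StationClient):
    """Made-up provider reporting temperature in tenths of a degree Celsius."""

    def fetch_raw(self):
        return [{"stn": "A1", "t": "2025-01-01T00:00", "temp_dC": 123}]

    def to_records(self, raw):
        return [(r["stn"], r["t"], "temperature", r["temp_dC"] / 10) for r in raw]


class ProviderBClient(StationClient):
    """Made-up provider reporting temperature in kelvin, nested by station."""

    def fetch_raw(self):
        return {"B7": {"2025-01-01T00:00": {"T_K": 285.15}}}

    def to_records(self, raw):
        return [
            (sid, t, "temperature", round(obs["T_K"] - 273.15, 2))
            for sid, series in raw.items()
            for t, obs in series.items()
        ]
```

Because both clients emit the same tuples in the same units, downstream code can concatenate their output without knowing which provider it came from.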
The best part is that usually companies use AI to replace workers and save money, but if Nature (and many other publishers) were to do so, they would actually INCREASE their reviewing costs from essentially zero to some AI-related compute fees.

www.youtube.com/watch?v=8F9g...
Academic Journals Doing Crime
YouTube video by Dr. Glaucomflecken
March 6, 2025 at 1:05 PM