FOIMonkey
@foimonkey.bsky.social
Recovering FOI enthusiast and polyglot with a developing machine learning habit.
I’ve extracted and analysed all the Public Interest arguments made in every decision ever issued by the Scottish Information Commissioner. It's completely changed my understanding of how PITs work in practice. Looking at the 25 most common "winning" factors, there were some I'd not thought of.
February 3, 2026 at 12:31 PM
2025 maybe?
January 2, 2026 at 2:39 PM
TIL that over the years, I have added or updated 34,724 unique public authorities on WhatDoTheyKnow. That's 74% of the total. It's all my fault 😅
November 28, 2025 at 12:10 PM
The decision by a number of local councils to run adverts on their websites didn't quite sit right with me. I couldn't quite work out why until I saw an advert for a credit card on the crisis loan page.
November 14, 2025 at 12:11 PM
When I find failed redactions or accidental releases of PII in FOI responses, I will usually notify the authority. The response is mixed, but far too often there is just silence. Often the only way I know they've got my message is by seeing if the file has disappeared. You'd think they'd want to know.
September 22, 2025 at 11:15 AM
Microsoft seems to have pulled the larger VibeVoice TTS model from Hugging Face, and the GitHub repo 404s github.com/microsoft/Vi.... It's not been out for long, but I can't be alone in having downloaded both, and it's MIT licensed, so there is nothing to stop mirrors. I wonder what the issue is? 🤔
https://github.com/microsoft/VibeVoice
September 4, 2025 at 9:03 AM
The SSD drive of shame has hit 2,752,077 files. Of course I haven't plugged in a new one rather than face clearing it 😅
August 27, 2025 at 12:43 PM
As a bit of fun, I used some of the WDTK keywords that I created last week to create synthetic FOI requests using mistral-small. I then finetuned SmolLM2-360M-Instruct on those outputs to generate requests from 3 keywords and the authority name: huggingface.co/HMC83/reques...
HMC83/request_writer_smol · Hugging Face
August 18, 2025 at 12:02 PM
I've uploaded descriptive keywords for over 1 million public FOI requests. I have left it at request_id, keywords and the name of the public authority for now: github.com/FOIMonkey/fo...
August 15, 2025 at 12:21 PM
As a side effect of this, I have generated keywords for over 1 million FOI requests. Figuring out how to do that well in the most lightweight way was a journey in and of itself. I've not looked yet, but combined with authority and outcome data, it should be possible to spot some interesting trends.
August 14, 2025 at 1:50 PM
Turns out you don't need to read an FOI response to start to be able to guess the outcome. I trained a TF-IDF classifier with a 73% macro F1-score in predicting success using just 3 keywords about the request and metadata. Adding the full request text hits 76% & a snippet of the response email 84%.
August 14, 2025 at 1:43 PM
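For readers unfamiliar with the metric quoted above: macro F1 is the unweighted mean of the per-class F1 scores, so a rare outcome counts as much as a common one. Below is a minimal sketch of that calculation in plain Python. The function name `macro_f1` and the example labels are hypothetical, and this is not the code used for the classifier in the post.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[guess] += 1   # predicted class was wrong
            fn[truth] += 1   # true class was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical outcome labels, for illustration only
y_true = ["successful", "successful", "successful", "refused"]
y_pred = ["successful", "successful", "refused", "refused"]
score = macro_f1(y_true, y_pred)
```

In this toy example the per-class F1 scores are 0.8 ("successful") and about 0.67 ("refused"), and macro F1 is their plain average.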
I wrote up some quick notes on yesterday's journey to nowhere:
foimonkey.github.io/posts/12-hou...
12 hour Public Transport challenge
Objective: Travel on as many different types of public transport as possible within a 12 hour window, starting and finishing in the same location.
August 7, 2025 at 2:50 PM
Made it back to Cowes in 11 hours 39 minutes. Taking the floating bridge x 2, a double decker bus, a hovercraft, a single decker bus, a tram, the Overground, DLR x 2 (got on the wrong train), the cable car, a catamaran, the Underground, an automatic people mover, 3 x trains, and the vehicle ferry.
August 6, 2025 at 5:26 PM
Going to see how many different forms of public transport it is possible to take in one day today. First up: the floating bridge.
August 6, 2025 at 5:36 AM
I took 40,000 images from emails in the WhatDoTheyKnow archive and extracted the five most dominant colours from each (excluding monochrome/shades of grey). Behold the palette of the UK public sector.
August 3, 2025 at 3:06 PM
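The post does not say how the dominant colours were extracted. One simple way to approximate the idea is to drop near-grey pixels, quantise the rest into coarse RGB buckets, and count the most frequent buckets. The sketch below does that on a plain list of RGB tuples. The function name `dominant_colours`, the `grey_tolerance` and `step` values, and the sample pixels are all assumptions for illustration, not the method used for the palette above.

```python
from collections import Counter


def dominant_colours(pixels, n=5, grey_tolerance=16, step=32):
    """Return up to n most common non-grey colours from (r, g, b) tuples."""
    counts = Counter()
    for r, g, b in pixels:
        # Channels that are nearly equal mean black, white or a shade of grey
        if max(r, g, b) - min(r, g, b) <= grey_tolerance:
            continue
        # Snap each channel to the centre of its bucket so close shades merge
        bucket = tuple((c // step) * step + step // 2 for c in (r, g, b))
        counts[bucket] += 1
    return [colour for colour, _ in counts.most_common(n)]


# Hypothetical pixel data standing in for a decoded image
pixels = (
    [(200, 30, 30)] * 3      # red
    + [(205, 28, 25)] * 2    # slightly different red, same bucket
    + [(10, 10, 10)] * 10    # near-black, ignored as monochrome
    + [(20, 40, 200)] * 4    # blue
)
palette = dominant_colours(pixels)
```

With real images the pixel list would come from an image library, and a clustering method such as k-means is a common alternative to fixed buckets.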
Then I'm going to have another go at teaching it Welsh. Did the 1B version on the weekend, which showed promising results in terms of picking up the vocab/grammar vs the base model. A larger, more capable model and a more curated dataset seem like the way to go there if it is to be useful.
July 29, 2025 at 11:43 AM
I'm starting a couple of new projects today. First up, trying to make OLMo-2 better at reasoning. I want to have a go at teaching it to generate <think> tokens showing its chain-of-thought as it answers. It will be a couple of days before I know if it has worked.
July 29, 2025 at 11:40 AM