Dustin Miller
@spdustin.com
Progressive, LGBTQ ally, feminist, autism dad, ADHD-er, and huge nerd (OSINT, martech, machine learning/AI, home automation, electronics). I believe abortion is healthcare, and that we have too many guns in this country.
Well, I thought I was halfway through getting the damned vision model trained, but alas...

It's still coming along, though, and updates have been posted.

Any interested parties can check out this thread for details and info on how to help.
For those of you watching my structured extract of the House's Epstein Estate Document dump, I've started to commit emails to the repo.

I killed a lot of time trying to train Detectron2 to better handle all these various emails, but the redactions and weird merges of emails and docs were...

(cont)
GitHub - spdustin/epstein-estate-documents
github.com
November 15, 2025 at 1:24 AM
Unless y'all know someone at Google or Anthropic (or both, why not both) who can provide comped API credits, it'll be very slow going, which makes me sad.

Donations are welcome; I'm $spdustin on cashapp and @spdustin on Venmo. Funds will be used to upgrade subscriptions to get higher rate limits.
November 15, 2025 at 1:23 AM
...fraught.

So I'm using LLM+Vision models (which each need their own special hand-holding) and structured outputs to process the emails.

Unfortunately, I can't afford to use Claude or Gemini via API at this scale, so I'm stuck with the rate limits imposed on the lowest-tier paid subscriptions.
November 15, 2025 at 1:23 AM
I’m about halfway through a proper OCR run with segmented layout regions to extract actual metadata from the emails. I hope to post a proper update to the corpus tomorrow along with graphs between named entities.

And the iMessage transcripts are about to get an update unmasking a few names.
GitHub - spdustin/epstein-estate-documents
github.com
November 14, 2025 at 9:29 AM
Happy to help! There are more updates coming tomorrow to make it much easier to navigate.
November 14, 2025 at 8:36 AM
ICYMI: I also created a corpus from Project 2025 that is friendly to both researchers and LLMs, complete with rich metadata, guidance on splitting/chunking, and tips on using LLMs to assist in grounded research.
GitHub - spdustin/Project-2025
github.com
November 13, 2025 at 7:10 PM
Still to come: pipe emails through a better OCR pipeline than the House used (using CV to ID headers, quotes, etc.), updated Concordance/Opticon stores with richer metadata, LLM-friendly versions for "chat with docs" workflows.

Check README to see what's ready now, and for important cloning info!
November 13, 2025 at 6:56 PM
I haven't missed it at all. Deleted an active account I had since the aughts, and haven't experienced a single pang of regret.
November 12, 2025 at 4:49 AM
And don’t be surprised if your GPS is all wonky for the next day or so.
November 12, 2025 at 4:14 AM
NE IL here
Around the I-80/I-55 intersection in Illinois, the aurora is visible in a 3-second night mode exposure on an iPhone 15 Pro.
November 12, 2025 at 4:13 AM
Even if you don’t see colors in your northern sky, just TRY a night mode photo on your iPhone/Android. You will be surprised at what 3 seconds can do.

Brace the phone on a stable surface for best results.
November 12, 2025 at 4:10 AM
First sensual Alf label I’ve seen in a minute, and it was everything I wanted.
November 12, 2025 at 4:07 AM
Thanks for your reporting here, José, it’s important work. Does ProPublica make available the original corpus of records or the normalized datasets? It would be helpful to understand specific communities impacted, and can lead to more informed outreach to those experiencing food insecurity.
October 13, 2025 at 10:25 PM
Pleasantly surprised to see this one on my feed, can't wait for more in the series!

I can't contribute to the Patreon, but if you ever find yourself in the far SW burbs, my DMs are open, and we can nerd out about home automation, electricity, dehumidifiers, and all things CAN bus.
October 3, 2025 at 7:09 PM
Just wait until they find out Scamuel L. Jackson (thanks, Scam Goddess pod, I’ll never stop laughing at the nicknames) converted to Islam. Universe-breaking levels of cognitive dissonance.
September 22, 2025 at 11:18 PM
I think they’re killing it as Ellie, tbh. And the changes that Nick and Craig made for the show were either changes that Nick wanted in the game in the first place, or both of them recognizing that handling converging story arcs in a game is VERY different from television. Payoff has to come sooner.
May 29, 2025 at 5:19 PM
*sigh*

They.
May 29, 2025 at 5:15 PM
I swear on all that is holy I, an avowed home automation and gadget nerd, will never buy another gadget that needs to access the internet for anything. And I’ll pay more if it itself can be controlled on my LAN via a simple and documented API.
May 24, 2025 at 12:25 AM
I want to go to there
May 24, 2025 at 12:19 AM