brendan chambers
societyoftrees.bsky.social
Ithaca | prev Chicago | interested in interconnected systems and humans+computers | past and future: academic and industry research | currently: gardening
tldr of Andy’s back of envelope math: in Morrow County data centers may actually be accounting for only ~1% of local wastewater
November 26, 2025 at 6:40 AM
the local agriculture was highly polluting but drew water primarily from a river

while the data centers drew from the (poisoned) water table,

competing with residents for the deepest wells and sending outputs into processing ponds that couldn’t handle the volume
November 26, 2025 at 6:03 AM
so Morrow County Oregon seems to be an example of a drinking water crisis accelerating b/c of data center buildouts

the original crisis was caused by agriculture, but the scale of the issue worsened because of data center wastewater handling
November 26, 2025 at 6:03 AM
Reposted by brendan chambers
Test-time reasoning guidance: up to 66.7% improvement 💡

We scaffold cognitive structures from successful traces to guide reasoning.

Major gains on ill-structured problems🌟

Models possess latent capabilities—they just don't deploy them adaptively without explicit guidance.
November 25, 2025 at 6:26 PM
Reposted by brendan chambers
We analyzed 1,598 LLM reasoning papers:

Research concentrates on easily quantifiable behaviors—sequential organization (55%), decomposition (60%)

Neglects meta-cognitive controls (8-16%) and alternative representations (10-27%) that correlate with success⚠️
November 25, 2025 at 6:26 PM
Reposted by brendan chambers
Our taxonomy bridges cognitive science → LLM eval:

28 elements across 4 dimensions—reasoning invariants (compositionality, logical coherence), meta-cognitive controls (self-awareness), representations (hierarchical, causal), and operations (backtracking, verification)
November 25, 2025 at 6:26 PM
I have been wondering about this too. I’m still a bit unclear about issues like water table health and waste heat. Andy’s writing is a good reminder of how incredibly water-intensive agriculture is, too
November 14, 2025 at 8:52 PM
Reposted by brendan chambers
🌊 Global Mangrove Watch is using OlmoEarth to refresh mangrove map baselines faster, with higher accuracy & less manual annotation—allowing orgs + governments to respond to threats more quickly.
Learn more → buff.ly/6xLHLk6
November 4, 2025 at 2:53 PM
More work looking into reverse KL in the context of distillation. Missed this at the time, looking forward to reading

arxiv.org/pdf/2306.08543
October 28, 2025 at 7:05 PM
🤖
October 28, 2025 at 6:37 PM
It was great to have a reason to look more closely at Agarwal et al. again. I first saw this work back in my QuillBot era, via a great colleague (not naming them without permission)…brought back some good memories from 2023/2024
October 28, 2025 at 6:37 PM
It makes me wonder, has any other work looked at this trick (mixing reverse KL into the loss) during earlier stages of training to mitigate drift in long tail activations? How about work investigating mode-dropping and divergence measures?
October 28, 2025 at 6:37 PM
In the Thinking Machines post, for this post-training stage they discuss reverse KL only. Agarwal et al. suggest that interpolating with Jensen-Shannon divergence might be worth exploring too, especially if excessive mode-dropping becomes an issue.
October 28, 2025 at 6:37 PM
In Agarwal et al., the optimal amount of forward/reverse interpolation seemed to vary across tasks (though it’s a bit risky to interpret ROUGE and BLEU alone, and this might be an artifact of the evaluation strategy), but the best approach was always a mixture, especially when sampling.
October 28, 2025 at 6:37 PM
A good refresher on forward/reverse KL divergence and notation conventions is:
agustinus.kristia.de/blog/forward...
October 28, 2025 at 6:37 PM
We usually don’t give much thought to how forward KL (and cross entropy) losses fail to directly penalize student mistakes where the teacher distribution has no coverage. The choice of divergence also impacts mode-dropping—super relevant to capacity reduction during distillation.
October 28, 2025 at 6:37 PM
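A tiny numeric sketch of the point about coverage and mode-dropping, assuming the common distillation convention that forward KL is KL(teacher || student) and reverse KL is KL(student || teacher). The 3-token vocabulary, the probabilities, and the `kl` helper are all hypothetical, chosen only to make the asymmetry visible.

```python
import math

def kl(p, q, eps=1e-12):
    """KL(p || q) in nats for discrete distributions given as lists.

    q is floored at eps so zero-probability entries give a large finite
    penalty instead of a division by zero.
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

# Case 1: the student puts mass where the teacher has no coverage.
teacher = [0.5, 0.5, 0.0]
student = [0.4, 0.4, 0.2]
forward = kl(teacher, student)   # expectation under the teacher: token 2 is never visited
reverse = kl(student, teacher)   # expectation under the student: token 2 is heavily penalized
print(f"off-support mass: forward={forward:.3f}, reverse={reverse:.3f}")

# Case 2: the student collapses onto one of the teacher's two modes.
teacher_bimodal = [0.49, 0.49, 0.02]
student_dropped = [0.98, 0.01, 0.01]
forward_drop = kl(teacher_bimodal, student_dropped)
reverse_drop = kl(student_dropped, teacher_bimodal)
print(f"mode dropping:    forward={forward_drop:.3f}, reverse={reverse_drop:.3f}")
```

In the first case forward KL stays small because the sum only runs over tokens the teacher actually emits, while reverse KL blows up on the off-support token. In the second case the ordering flips: forward KL is the one that objects to the dropped mode.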