Daniel Mewes
@dmewes.com
Computer scientist. Interested in technology, artificial and natural intelligence, emergent complexity, among other things. Blogging at amongai.com.

Currently at Imbue. Previously Ambient.ai, Stripe, RethinkDB, Max Planck Institute.
100% youtu.be/FyStPztM7eo. Things are really off in our current society. (video by @hankgreen.bsky.social)
Truth
YouTube video by Hank Green
youtu.be
January 14, 2026 at 2:40 AM
As someone who still finds typing on a touchscreen incredibly slow and frustrating, I'm very excited about the Clicks Communicator. It's basically a BlackBerry! clicksphone.com/communicator
Even comes with a notification LED!
Clicks Communicator: the ultimate communication companion
Clicks Communicator is a phone purpose-built for taking action and communicating in a noisy world with deeper context, versatile input and greater control in a compact design.
clicksphone.com
January 3, 2026 at 6:55 AM
Has anyone tried the Nyxt browser? nyxt.atlas.engineer

It's cool that it's written in Lisp and comes with a REPL. I'm running into some slow UI updates, though.
Nyxt browser: The hacker's browser
nyxt.atlas.engineer
December 20, 2025 at 6:01 AM
GPT 5.2 (thinking on high) seems to be buggy. I'm seeing reproducible issues where it starts glitching out in the middle of its response with garbage tokens and then ends the message.
Highly reproducible for me every 10th-20th prompt or so.
December 17, 2025 at 5:47 PM
Gemini 3 Flash beats Gemini 3 Pro in both ARC-AGI 2 (33.6% vs 31.1%) and in SWE-bench Verified (78% vs 76.2%). 1/4th the cost and faster.
This is very very impressive. Is there still a reason to use 3 Pro?
blog.google/products/gem...
Gemini 3 Flash: frontier intelligence built for speed
Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.
blog.google
December 17, 2025 at 5:41 PM
LLMs can generate arbitrary apps on the fly. Without ever writing source code.

I wrote about this here:
wp.me/pdyMeH-bd

This was really fun to play around with. You can try it out yourself using github.com/danielmewes/...
Hallucinate any App, One Screen at a Time
A major trend of this year has been vibe coding – using LLMs to create software from the ground up, without a human ever interacting with the source code directly. The current consensus is th…
wp.me
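The core of the trick is a loop: instead of emitting source code, the model is asked to render each screen directly from the app description, the screen history, and the user's last action. Here's a minimal sketch, assuming the Anthropic Python SDK; the prompts and structure are illustrative and may differ from the linked repo:

```python
# Minimal sketch of generating an app one screen at a time, assuming the
# Anthropic Python SDK. Prompts and loop are illustrative, not the code
# from the linked repo.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

SYSTEM = (
    "You ARE the app. Given the app description, recent screens, and the "
    "user's last action, reply with the full text of the next screen, "
    "including numbered options the user can choose from."
)

def next_screen(app_description: str, history: list[str], action: str) -> str:
    recent = "\n---\n".join(history[-5:])  # a few recent screens as context
    message = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        system=SYSTEM,
        messages=[{
            "role": "user",
            "content": (
                f"App: {app_description}\n\nRecent screens:\n{recent}\n\n"
                f"User action: {action}"
            ),
        }],
    )
    return message.content[0].text

history: list[str] = []
action = "open the app"
while True:
    screen = next_screen("a simple todo list manager", history, action)
    history.append(screen)
    print(screen)
    action = input("> ")  # whatever the user types drives the next screen
```

No source code ever exists; the "app" is just the model re-deciding what the next screen should look like on every interaction.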
December 10, 2025 at 5:33 PM
I do think it's most likely that this was indeed due to teleoperation. But I also think it would be funny if this behavior got accidentally trained into the model due to operator-gathered training data that didn't get trimmed correctly. :D
can't stop watching this clip of a Tesla Optimus teleoperator taking his headset off before properly logging out the robot
December 9, 2025 at 7:09 PM
Poetiq's methodology on top of Gemini 3 & GPT 5.1 exceeds average human performance on ARC-AGI-2!
This is huge.

The only caveat is that they evaluated on the public set - it might have been used in post-training of Gemini 3? Looking forward to seeing private eval results! poetiq.ai/posts/arcagi...
Traversing the Frontier of Superintelligence
Poetiq is proud to announce a major milestone in AI reasoning. We have established a new state-of-the-art (SOTA) on the ARC-AGI-1 & 2 benchmarks, significantly advancing both the performance and the e...
poetiq.ai
November 21, 2025 at 5:56 PM
There have been a few works that showed signs of LLM introspection since I wrote my article. Though the consensus still seems to be that it's quite unreliable. amongai.com/2024/12/24/l...
October 29, 2025 at 7:01 PM
Reposted by Daniel Mewes
This was a tough but necessary decision - I posted my own notes on this here, from the perspective of a current PSF board member simonwillison.net/2025/Oct/27/...
October 27, 2025 at 8:34 PM
I have to admit the mechanism behind cable impedance and impedance matching never really clicked for me, despite being a licensed radio amateur for 20 years.
...until I watched this video by AlphaPhoenix - it's such an amazing visualization of what's going on in a cable! youtu.be/RkAF3X6cJa4?...
What does "impedance matching" actually look like? (electricity waves)
YouTube video by AlphaPhoenix
youtu.be
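For reference, the single formula behind what the video visualizes (a standard result, not anything specific to the video): the fraction of a wave's voltage reflected at a junction depends only on the mismatch between the two impedances.

```latex
% Reflection coefficient at the junction between a line of characteristic
% impedance Z_0 and a load (or second cable) of impedance Z_L:
\[ \Gamma = \frac{Z_L - Z_0}{Z_L + Z_0} \]
% Z_L = Z_0 gives \Gamma = 0: no reflection, i.e. a matched line.
% An open end (Z_L \to \infty) gives \Gamma = +1; a short (Z_L = 0) gives \Gamma = -1.
```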
October 24, 2025 at 3:22 PM
Claude Haiku 4.5 outperforms Sonnet 4 in some coding benchmarks (such as SWE-bench Verified). This is exciting, since it's 1/3 the price.

However, in my actual use, I've found it to be a bit underwhelming compared to Sonnet 4 (not to mention Sonnet 4.5).

What has been your experience with Haiku?
October 23, 2025 at 5:35 PM
This is so true.
I think almost nobody is talking about symbolic GOFAI these days, so I'm not concerned about that.
But all of ML being re-branded to AI lately, while AI has simultaneously been made synonymous with generative AI in many places, has led to so much confusion.
The fallout from the fact that data science/classical machine learning & generative AI are both called "AI" has been remarkably broad & persistent

Policy addresses the wrong harms, companies have been confused about who should lead efforts, hiring is misguided, academic discussion is often muddled.
October 22, 2025 at 5:51 PM
The age of single-serving, disposable (simple) software is here. I now frequently have LLMs write software for me that I use exactly once.
Often to convert some data or create visualizations, but also one-off new features to add to some open-source application that I only want to use once.
October 8, 2025 at 12:07 AM
Reposted by Daniel Mewes
new blog post! why do LLMs freak out over the seahorse emoji? i put llama-3.3-70b through its paces with the logit lens to find out, and explain what the logit lens (everyone's favorite underrated interpretability tool) is in the process.

link in reply!
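For anyone who wants to try the logit lens themselves, here's a rough sketch using HuggingFace transformers: project each layer's hidden state through the model's final norm and unembedding to see which token it "leans toward" at that depth. Model choice and prompt are illustrative, not taken from the linked post, and the 70B model needs several GPUs, so start with a smaller Llama.

```python
# Rough logit-lens sketch: decode every layer's hidden state through the
# final RMSNorm and the unembedding matrix.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-3.3-70B-Instruct"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype="auto")
model.eval()

inputs = tok("Is there a seahorse emoji?", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states is a tuple of (num_layers + 1) tensors [1, seq, d_model]
for layer, h in enumerate(out.hidden_states):
    h_last = model.model.norm(h[0, -1])   # final RMSNorm, reused at each layer
    logits = model.lm_head(h_last)        # unembedding
    top = logits.argmax().item()
    print(f"layer {layer:3d}: {tok.decode([top])!r}")
```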
October 5, 2025 at 2:36 PM
Today we're releasing Sculptor, an AI coding tool that combines async agents with the ability to collaborate with agents locally.

It also comes with built-in verifiers to automatically check the quality of AI-written code. More to come! imbue.com/sculptor-ann...
September 30, 2025 at 5:04 PM
I used AI vibe coding tools to port Anthropic's API client to Common Lisp: github.com/danielmewes/...
It was a very quick and fun mini project that taught me a thing or two about Lisp. Use it at your own risk.
GitHub - danielmewes/anthropic-sdk-cl-port: An AI-written port of the Anthropic client SDK to Common Lisp.
An AI-written port of the Anthropic client SDK to Common Lisp. - danielmewes/anthropic-sdk-cl-port
github.com
September 22, 2025 at 4:42 PM
I feel like each of Anthropic's three postmortems is missing a key explanation step in its root-cause description. www.anthropic.com/engineering/... 🧵
A postmortem of three recent issues
This is a technical report on three bugs that intermittently degraded responses from Claude. Below we explain what happened, why it took time to fix, and what we're changing.
www.anthropic.com
September 18, 2025 at 5:47 PM
My Hulu / Disney+ subscription will be preempted indefinitely.
September 18, 2025 at 5:14 AM
Reposted by Daniel Mewes
As AI systems keep getting better at very hard problems while getting more opaque, the way that we work with AI is shifting from being collaborators who shape the process to being supplicants who receive the output.

I discussed what that means. www.oneusefulthing.org/p/on-working...
On Working with Wizards
Verifying magic on the jagged frontier
www.oneusefulthing.org
September 11, 2025 at 8:55 PM
Reposted by Daniel Mewes
Meta trained a special “aggregator” model that learns how to combine and reconcile different answers into a more accurate final one, instead of relying on simple majority voting or reward-model ranking over multiple model answers.
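For intuition, here's a sketch of the two control flows being contrasted, assuming the Anthropic Python SDK; Meta's aggregator is a trained model, not a prompted one, so this only mimics the idea:

```python
# Majority voting vs. an "aggregator" pass over multiple sampled answers.
# Illustrative only -- not Meta's method.
from collections import Counter
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-5"

def sample_answers(question: str, n: int = 5) -> list[str]:
    return [
        client.messages.create(
            model=MODEL, max_tokens=512,
            messages=[{"role": "user", "content": question}],
        ).content[0].text
        for _ in range(n)
    ]

def majority_vote(answers: list[str]) -> str:
    # Only meaningful when answers match exactly (e.g. short final answers).
    return Counter(answers).most_common(1)[0][0]

def aggregate(question: str, answers: list[str]) -> str:
    numbered = "\n".join(f"{i + 1}. {a}" for i, a in enumerate(answers))
    msg = client.messages.create(
        model=MODEL, max_tokens=512,
        messages=[{"role": "user", "content":
            f"Question: {question}\n\nCandidate answers:\n{numbered}\n\n"
            "Reconcile the candidates into one final answer, resolving "
            "disagreements rather than just counting votes."}],
    )
    return msg.content[0].text
```

The aggregator can recover a correct answer that appears in only one candidate, which pure vote counting by definition cannot.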
September 9, 2025 at 2:03 PM
Reposted by Daniel Mewes
The funny thing about the prediction that AI would be writing 90% of all code by now is that the prediction's failure distracts from the fact that AI adoption in code writing is actually extremely high: it was already over 30% in December 2024 according to one measure, with large economic impact.
September 3, 2025 at 4:19 PM
I remember seeing this graph and thinking it was about the influence of training data on LLM responses as well. Quite misleading.
This chart is everywhere and is being horribly misinterpreted.

This is not where the training data for AI comes from, it is a study done by a SEO firm that claims to show how often sites come up at least once in THE WEB SEARCH FUNCTION of certain AI agents when they do a web search for more info.
September 2, 2025 at 3:52 AM
I think we're starting to see diminishing returns from LLM pre- and post-training. The limitations of today's LLMs are unlikely to just disappear with the next bigger model.
This is not all bad: we can start focusing on how to work around those limitations and how to put current LLMs to work.
August 29, 2025 at 4:25 PM
Just experienced phantom braking for the first time in my Kia with HDA2 (Highway Drive Assist) and it made me sentimental about the exciting times of early Tesla Autopilot 😢
August 29, 2025 at 4:00 PM