Lightnews — Scholar-powered news

Avik Dey

@avikdey.bsky.social

460 followers 430 following 670 posts

Mostly Data, ML, OSS & Society • Stop chasing Approximately Generated Illusions; focus on Specialized Small LMs • To understand it well enough, learn to explain it simply • Shadow self of https://linkedin.com/in/avik-dey, have a beard now

Avik Dey

@avikdey.bsky.social

Still hasn’t read Ilya’s memo …

this disconnect between eval performance and actual real-world performance,

December 3, 2025 at 2:12 AM

Avik Dey

@avikdey.bsky.social

Ilya finally answers the question: What did Ilya see?

“this disconnect between eval performance and actual real-world performance,”

Next time someone goes - LLMs beat ‘So & So’ Olympiad - just quote Ilya.

November 27, 2025 at 5:42 PM

Avik Dey

@avikdey.bsky.social

Having faced this exact same repetitive issue since 2023, I would have laughed at this - if we didn’t have 1% of the GDP invested in this caricature of an “AI”.

www.dwarkesh.com/p/ilya-sutsk...

November 25, 2025 at 9:40 PM

Avik Dey

@avikdey.bsky.social

Ilya appears to be progressively approaching the right conclusion. Remain confident that in time he will consolidate his insights from first 5 minutes and recognize that complex explanations are unnecessary when simpler ones suffice.

(screenshots not chronological)

www.dwarkesh.com/p/ilya-sutsk...

November 25, 2025 at 8:17 PM

Avik Dey

@avikdey.bsky.social

Good to see research on what the math always said: low-to-average performers, that’s your LLM “employee”:

> This supports our assertion that the ceiling on LLM creativity (0.25) corresponds to the boundary between little-c and Pro-c human creative performance (Figure 6).

www.academia.edu/144621465/_T...

November 25, 2025 at 5:19 PM

Avik Dey

@avikdey.bsky.social

“warm-up”: Under the guidance of an expert human the model was finally able to get the answer right when nudged towards it.

Not the model, not the prompt - still the human.

The amount of shilling these guys do, no wonder they can’t get anything serious built.

cdn.openai.com/pdf/4a25f921...

November 23, 2025 at 5:33 PM

Avik Dey

@avikdey.bsky.social

Think they might have answered their own question … ?

bsky.app/profile/slas...

November 22, 2025 at 4:04 AM

Avik Dey

@avikdey.bsky.social

If these Gemini 3 Pro benchmarks are accurate, time for OpenAI to sell to Microsoft. Microsoft won’t want their management team or their prolifically tweeting engineers, but I am sure most engineers would thrive if led by seasoned engineering management.

storage.googleapis.com/deepmind-med...

November 18, 2025 at 4:51 PM

Avik Dey

@avikdey.bsky.social

Perfect prediction, even if I say so myself!

Actually their realization dawned a few weeks back, but these things take a little while to surface externally.

Image of tweet from bird site because I won’t link to it.

November 16, 2025 at 1:45 AM

Avik Dey

@avikdey.bsky.social

From the bird site, the acceleration continues:

November 16, 2025 at 1:30 AM

Avik Dey

@avikdey.bsky.social

The real star of the show is:

November 14, 2025 at 5:42 PM

Avik Dey

@avikdey.bsky.social

Mensch goes on to dunk on pre-training. This paragraph rhetorically sounds good but is both technically and ethically shallow. Data and pre-training IS the entire foundation of the model. He discounts both to shift onus from AI companies to the “deployer”. Complete hogwash; he learnt well from Sam.

"We need to remember that we're talking about data-sets that are thousands of billions of flops. So how, based on this data set, how are we going to know that we've done a good job at having no biases in the output of the data model? And in fact, the actual, actionable way of reducing biases in model is not during pre-training, so not during the phase where you see all of the data-set, it's rather during fine tuning, when you use a very small data-set to set these things appropriately. And so to correct the biases it's really not going to help to know the input data set."

November 12, 2025 at 4:11 PM

Avik Dey

@avikdey.bsky.social

Demo coming soon …

bsky.app/profile/avik...

November 5, 2025 at 4:06 PM

Avik Dey

@avikdey.bsky.social

Every author writing like this should be required to rewrite their abstract in plain English and read it aloud to an audience of their peers before they can publish it.

Summary: Conjectural with nice diagrams but no quantitative measures and ignores prior literature.

arxiv.org/pdf/2510.26745

November 3, 2025 at 4:14 PM

Avik Dey

@avikdey.bsky.social

Karpathy’s tweet is a live demo of the learning loop he promotes. Consciously or not, he is channeling:

- Kolb: Experiential learning theory
- Feynman: Explain in your own words
- Dweck: Growth mindset scale

The medium is the message.

November 1, 2025 at 4:16 PM

Avik Dey

@avikdey.bsky.social

She’s making the classic layman’s mistake of thinking DeepMind is synonymous with AI. If she had actually read even the first paragraph of their paper, she might have clued in it’s a great example of purely statistical machine learning, but that’s probably asking too much.

arxiv.org/pdf/2506.10772

October 30, 2025 at 4:02 PM

Avik Dey

@avikdey.bsky.social

Today for some reason, I felt the urge to share this tweet posted back in 2018:

Tweet 1: its funny when you convert all of your websites & apps to run on a single cloud provider after you have pitched why you do not want to rely on a single cloud provider. really not funny. hopefully the software lives on, continues getting adopted & maintains cloud independence.

Tweet 2: otherwise your 2022's cloud usage fees will look very much like your 2005's database licensing fees.

October 20, 2025 at 6:22 PM

Avik Dey

@avikdey.bsky.social

Read this if you work with LLMs. For those of us who have been hands on, both under the hood and in front of the shiny bits, it’s been obvious from the get go.

Folks still refusing to acknowledge the obvious are invested in it, directly or indirectly.

For the rest of us it’s just another tool.

… because people are afraid to say it. Mid-level managers and individual workers who know this is the common-sense view on AI are concerned that simply saying that they think AI is a normal technology like any other, and should be subject to the same critiques and controls, and be viewed with the same skepticism and care, fear for their careers. People worry that not being seen as mindless, uncritical AI cheerleaders will be a career-limiting move in the current environment of enforced conformity within tech, especially as tech leaders are collaborating with the current regime to punish free speech, fire anyone who dissents, and embolden the wealthy tycoons at the top to make ever-more-extreme statements, often at the direct expense of some of their own workers.

This is all exacerbated by the awareness that hundreds of thousands of technical staff like engineers have been laid off in recent times, often in an ongoing drip of never-ending layoffs, and very frequently in an unnecessarily dehumanizing and brutal process intended to instill fear in those who remain at the companies afterward.

October 18, 2025 at 11:20 PM

Avik Dey

@avikdey.bsky.social

If even Karpathy can’t get AI coding to work for him, are you willing to bet on it working for you? Yours are IID, you say? You will soon find out dimensionality means yours are OOD too.

October 13, 2025 at 9:48 PM

Avik Dey

@avikdey.bsky.social

Nice paper. Observations:

- Verifier assisted code snippet optimization
- Before solutions are sub-optimal as are the after solutions
- Some of the paper’s commentary is contradictory
- If framework included chaos monkey to simulate real world, would these hold?

arxiv.org/abs/2510.061...

October 10, 2025 at 4:47 PM

Avik Dey

@avikdey.bsky.social

All those words just to say - ‘If you view them thru an anthropomorphic lens, LLMs are just shallow replicas of human intelligence.’

Buddy - the rest of us knew.

(Not going to link, but it’s on the bird site)

October 2, 2025 at 4:08 PM

Avik Dey

@avikdey.bsky.social

Used to be Sunday breakfast staple with “luchi”, in my younger days.

August 29, 2025 at 8:18 PM

Avik Dey

@avikdey.bsky.social

Something like this? Very popular in other parts of Asia too.

i.ytimg.com/vi/fGODsn6NR...

August 29, 2025 at 7:01 PM

Avik Dey

@avikdey.bsky.social

Was always the plan. A lawsuit needs to be filed to save copies of all of those server logs indefinitely.

bsky.app/profile/tech...

August 26, 2025 at 5:39 PM

Avik Dey

@avikdey.bsky.social

Yes, they are or they wouldn’t be redistricting while it’s still 5 years to next census.

Not a great idea to listen to ‘"center-left, corporate and GOP donor-funded nonprofit", which advocates for neoliberal policies and is staunchly opposed to Medicare for All.’

en.m.wikipedia.org/wiki/Third_W...

August 23, 2025 at 9:31 PM
