Lightnews — Scholar-powered news

Mohit

@mohitmayank.com

Perfect Sunday to dive into something that still confuses a lot of people - what exactly does prompt caching cache?

Some assume it caches the model's inputs and responses. Makes sense, right? Same prompt = same answer, stored somewhere.

Wrong.

December 21, 2025 at 6:17 AM

Mohit

@mohitmayank.com

If you are building Agentic AI, Google's FunctionGemma 270M model can save you lots of 🤕 and 💸

Model: huggingface.co/google/func...
Blog: blog.google/technology/...

December 19, 2025 at 11:34 AM

Mohit

@mohitmayank.com

Your AI pipeline is only as fast as your slowest transformation

CocoIndex is a data transformation framework built for AI workloads with its core engine written in Rust. If you're building AI pipelines and tired of slow data transformation, worth checking out.

github.com/cocoindex-i...

December 18, 2025 at 8:29 AM

Mohit

@mohitmayank.com

Gemini 3 Flash just killed the speed vs intelligence tradeoff discussion

Frontier-level performance (90.4% on GPQA Diamond, 33.7% on Humanity's Last Exam) but 3x faster than 2.5 Pro and at less than 1/4th the cost of 3 Pro.

The best part? It's available for free on Google Antigravity!

December 18, 2025 at 6:16 AM

Mohit

@mohitmayank.com

Want to understand how PyTorch actually works under the hood? TinyTorch lets you build it from scratch.

It's a hands-on ML systems project that takes you through 20 modules - from implementing basic Tensor operations to building complete training pipelines with checkpointing, gradient clipping, etc.

December 17, 2025 at 6:17 AM

Mohit

@mohitmayank.com

vLLM's model routing is much more than simple domain classification

December 17, 2025 at 3:32 AM

Mohit

@mohitmayank.com

Training a TTS model doesn't have to be complex - LLM + Neural Codec approach makes it surprisingly straightforward

1/4

December 15, 2025 at 11:33 AM

Mohit

@mohitmayank.com

Apple's latest paper once again shows that sometimes the best solutions aren't the most complex ones.

December 15, 2025 at 6:11 AM

Mohit

@mohitmayank.com

TIL there is a Poker course by Johns Hopkins and it has helped people win tournaments!

youtu.be/7cAzgUIKI68

December 13, 2025 at 6:19 AM

Mohit

@mohitmayank.com

More agents ≠ better results

Google Research just released a comprehensive study on scaling agent systems, and the findings challenge the "just add more agents" narrative.

Key findings across 180 configurations:

December 11, 2025 at 11:27 AM

Mohit

@mohitmayank.com

LangChain just dropped a complete guide on building Voice Agents in JavaScript!

Two main architectures for voice AI:
1. The Sandwich - STT → Agent → TTS
2. Speech-to-Speech (S2S)

LangChain's demo uses the sandwich approach - achieving sub-700ms latency while maintaining modularity.
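A minimal sketch of the sandwich layout, to make the modularity point concrete. This is not LangChain's API: `SandwichVoiceAgent` and the three stub functions are hypothetical stand-ins, and a real pipeline would stream audio rather than pass whole byte strings.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SandwichVoiceAgent:
    """Hypothetical wrapper: three independent stages chained in order."""
    stt: Callable[[bytes], str]    # speech-to-text stage
    agent: Callable[[str], str]    # text-in, text-out agent stage
    tts: Callable[[str], bytes]    # text-to-speech stage

    def respond(self, audio_in: bytes) -> bytes:
        transcript = self.stt(audio_in)   # STT
        reply = self.agent(transcript)    # Agent
        return self.tts(reply)            # TTS


# Stub stages (assumptions for illustration only): "audio" is just UTF-8 text.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_agent(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = SandwichVoiceAgent(stt=fake_stt, agent=fake_agent, tts=fake_tts)
```

Because each stage is just a callable, any one of them can be swapped out without touching the other two, which is the modularity the sandwich approach trades against S2S.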

December 10, 2025 at 11:30 AM

Mohit

@mohitmayank.com

Before you upgrade to a pricier LLM, please make sure you're optimizing your context management.

Manus uses 50 tool calls per session on average. Without context engineering, the context window fills up and performance tanks.

#1

December 10, 2025 at 6:11 AM

Mohit

@mohitmayank.com

This will make you rethink AI agent eval

December 8, 2025 at 11:34 AM

Mohit

@mohitmayank.com

Every new model claims to be state-of-the-art. Hugging Face released a guidebook to help you think critically about those claims.

The guide covers everything from basic metrics to advanced benchmarking strategies, including what evaluation can (and critically, can't) tell you about your models.

December 8, 2025 at 8:26 AM

Mohit

@mohitmayank.com

Microsoft just dropped VibeVoice-Realtime-0.5B - a 500M parameter TTS model that produces speech in ~300ms from first token input.

Try it here: huggingface.co/spaces/anyc...

December 8, 2025 at 6:11 AM

Mohit

@mohitmayank.com

GPT-5.1 Codex Max is free on Cursor till December 11th - 272k context, coding-focused, sounds perfect, right?

Not quite. It's making silly mistakes, struggling with complex tasks, and sometimes even failing at simpler ones. Honestly, Composer 1 still feels much better.

December 5, 2025 at 11:30 AM

Mohit

@mohitmayank.com

Visualizing networks shouldn't require 100 lines of code. Jaal does it in 2.

Built a Python library that turns graph data into an interactive dashboard - search nodes, filter by features, color by attributes, all in a clean web interface.

GitHub: github.com/imohitmayan...

December 5, 2025 at 6:19 AM

Mohit

@mohitmayank.com

Evolution of code LLMs

December 4, 2025 at 8:28 AM

Mohit

@mohitmayank.com

A 300+ page deep dive into Code LLMs - from foundation models to autonomous agents.

Read: arxiv.org/pdf/2511.18538

December 4, 2025 at 6:14 AM

Mohit

@mohitmayank.com

Frontier AI models can be used to exploit blockchain smart contracts!

Anthropic's new SCONE-bench tested models on 405 actual smart contract vulnerabilities from 2020-2025. Result? Top models cracked 19 of the 34 contracts that were exploited after March 2025 - contracts they'd never seen before!

#1

December 3, 2025 at 4:53 AM

Mohit

@mohitmayank.com

Ever watched your AI agent struggle with a 10-step task? LangChain just dropped DeepAgents - a standalone library for building agents that can actually handle complex, multi-step workflows.

docs.langchain.com/oss/python/...

December 2, 2025 at 8:28 AM

Mohit

@mohitmayank.com

A 100-minute Stanford lecture on Agents, Prompts and RAG - available on YT!

The latest Stanford CS230 lecture breaks down the core techniques that differentiate robust AI products from simple prototypes.

Watch it here: www.youtube.com/watch?v=k1n...

December 2, 2025 at 6:13 AM

Mohit

@mohitmayank.com

Instead of using a big model, use smaller models to manage bigger ones!

Nvidia's "ToolOrchestra" features an 8B orchestrator that determines when to use specialized tools and more powerful models, rather than relying on one massive model for everything.

Paper: arxiv.org/pdf/2511.21689
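A toy sketch of the routing idea, not Nvidia's method: the post describes a model-based 8B orchestrator, while the `route` function below is a hand-written heuristic swapped in purely for illustration. The handler names and thresholds are hypothetical.

```python
from typing import Callable, Dict


def route(query: str) -> str:
    """Hypothetical heuristic standing in for the small orchestrator's decision."""
    has_digit = any(ch.isdigit() for ch in query)
    has_operator = any(op in query for op in "+-*/")
    if has_digit and has_operator:
        return "calculator"       # cheap specialized tool
    if len(query.split()) > 12:
        return "large_model"      # escalate long, complex requests
    return "small_model"          # default to the cheap model


# Stub handlers (assumptions): each would wrap a real tool or model call.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "calculator": lambda q: f"[calculator] {q}",
    "large_model": lambda q: f"[large model] {q}",
    "small_model": lambda q: f"[small model] {q}",
}


def orchestrate(query: str) -> str:
    """Pick a handler with the router, then dispatch the query to it."""
    return HANDLERS[route(query)](query)
```

The point of the design is that the routing decision is cheap, so the expensive model is only called when the router judges it necessary.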

December 1, 2025 at 4:36 AM

Mohit

@mohitmayank.com

The way you manage context can make or break your AI agent.

Here is an excellent article on the topic by Anthropic www.anthropic.com/engineering...

November 29, 2025 at 8:29 AM

Mohit

@mohitmayank.com

How is this free? A 12-hour CUDA course taking you from beginner to expert!

If you want to understand what actually happens under the hood when you run model.fit(), this 12-hour CUDA course is exactly what you need.

Watch: www.youtube.com/watch?v=86F...

November 28, 2025 at 6:28 AM
