mohitmayank.com
Mohit
@mohitmayank.com
The AI Guy | Helping Startups 10x AI Game 🤖 | Author “Lazy Data Science Guide” | Creator of Jaal
Perfect Sunday to dive into something that still confuses a lot of people - what exactly does prompt caching cache?

Some assume it caches the model's inputs and responses. Makes sense, right? Same prompt = same answer, stored somewhere.

Wrong.
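In the usual implementation, what gets cached is the computed prefix state (the attention key/value tensors for the shared prompt prefix), not the answer — decoding still runs fresh on every request. A toy sketch of that idea (illustrative only, not a real serving stack):

```python
# Toy illustration: prompt caching stores the computed prefix
# state (think attention K/V tensors), NOT the response.
import hashlib

kv_cache = {}       # prefix hash -> precomputed "state"
compute_calls = 0   # counts how often we pay the expensive prefix cost

def encode_prefix(prefix: str):
    """Stand-in for the expensive forward pass over the prompt prefix."""
    global compute_calls
    compute_calls += 1
    return [ord(c) for c in prefix]  # pretend these are K/V tensors

def generate(prefix: str, question: str) -> str:
    key = hashlib.sha256(prefix.encode()).hexdigest()
    if key not in kv_cache:
        kv_cache[key] = encode_prefix(prefix)  # cache miss: full cost
    state = kv_cache[key]                      # cache hit: reuse prefix state
    # Decoding still runs every time -- responses are never cached here.
    return f"answer({question}, state_len={len(state)})"

system = "You are a helpful assistant. " * 10
a1 = generate(system, "What is 2+2?")
a2 = generate(system, "What is 3+3?")  # same prefix: reuses cached state
```

Two calls with the same prefix, but the expensive prefix computation happens only once — while the two answers are still different.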
December 21, 2025 at 6:17 AM
If you are building Agentic AI, Google's FunctionGemma 270M model can save you lots of 🤕 and 💸

Model: huggingface.co/google/func...
Blog: blog.google/technology/...
December 19, 2025 at 11:34 AM
Your AI pipeline is only as fast as your slowest transformation

CocoIndex is a data transformation framework built for AI workloads with its core engine written in Rust. If you're building AI pipelines and tired of slow data transformation, worth checking out.

github.com/cocoindex-i...
December 18, 2025 at 8:29 AM
Gemini 3 Flash just killed the speed vs intelligence tradeoff discussion

Frontier-level performance (90.4% on GPQA Diamond, 33.7% on Humanity's Last Exam) but 3x faster than 2.5 Pro and at less than 1/4th the cost of 3 Pro.

The best part? It's available for free on Google Antigravity!
December 18, 2025 at 6:16 AM
Want to understand how PyTorch actually works under the hood? TinyTorch lets you build it from scratch.

It's a hands-on ML systems project that takes you through 20 modules - from implementing basic Tensor operations to building complete training pipelines with checkpointing, gradient clipping, etc.
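To see what "implementing basic Tensor operations" actually involves, here's a minimal scalar autograd in that spirit — a hypothetical sketch, not TinyTorch's actual API (and real implementations traverse the graph in topological order rather than recursing):

```python
# Minimal scalar autograd in the "build PyTorch from scratch" spirit.
class Scalar:
    def __init__(self, value, parents=(), grad_fns=()):
        self.value = value
        self.grad = 0.0
        self._parents = parents    # upstream nodes in the compute graph
        self._grad_fns = grad_fns  # local derivative w.r.t. each parent

    def __add__(self, other):
        return Scalar(self.value + other.value, (self, other),
                      (lambda g: g, lambda g: g))

    def __mul__(self, other):
        return Scalar(self.value * other.value, (self, other),
                      (lambda g: g * other.value, lambda g: g * self.value))

    def backward(self, grad=1.0):
        # Naive recursive backprop; fine for small expression trees.
        self.grad += grad
        for parent, fn in zip(self._parents, self._grad_fns):
            parent.backward(fn(grad))

x = Scalar(3.0)
y = Scalar(4.0)
z = x * y + x   # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
```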
December 17, 2025 at 6:17 AM
vLLM's model routing is much more than simple domain classification
December 17, 2025 at 3:32 AM
Training a TTS model doesn't have to be complex - LLM + Neural Codec approach makes it surprisingly straightforward

1/4
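The recipe in miniature — every component below is a stand-in, not a real model: an autoregressive LM maps text tokens to discrete audio-codec tokens, and the codec's decoder turns those into a waveform.

```python
# Toy pipeline sketch of the LLM + neural-codec TTS recipe.
def text_to_tokens(text):
    """Tokenizer stand-in."""
    return [ord(c) % 256 for c in text]

def lm_predict_codec_tokens(text_tokens):
    """A real system runs an LLM trained to emit discrete codec codes."""
    return [(t * 7) % 1024 for t in text_tokens]

def codec_decode(codec_tokens):
    """A real neural codec (EnCodec-style) decodes codes to audio samples."""
    return [c / 1024.0 for c in codec_tokens]  # fake waveform samples

waveform = codec_decode(lm_predict_codec_tokens(text_to_tokens("hello")))
```

The simplicity is the point: once speech is just another discrete token stream, the whole problem reduces to standard next-token prediction.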
December 15, 2025 at 11:33 AM
Apple's latest paper once again shows that sometimes the best solutions aren't the most complex ones.
December 15, 2025 at 6:11 AM
TIL there is a Poker course by Johns Hopkins and it has helped people win tournaments!

youtu.be/7cAzgUIKI68
December 13, 2025 at 6:19 AM
More agents ≠ better results

Google Research just released a comprehensive study on scaling agent systems, and the findings challenge the "just add more agents" narrative.

Key findings across 180 configurations:
December 11, 2025 at 11:27 AM
LangChain just dropped a complete guide on building Voice Agents in JavaScript!

Two main architectures for voice AI:
1. The Sandwich - STT → Agent → TTS
2. Speech-to-Speech (S2S)

LangChain's demo uses the sandwich approach - achieving sub-700ms latency while maintaining modularity.
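The sandwich loop fits in a few lines. This is a stub sketch in Python (LangChain's demo is JavaScript with real STT/TTS providers; these function bodies are placeholders):

```python
# "Sandwich" voice-agent loop: STT -> Agent -> TTS, each stage swappable.
def stt(audio: bytes) -> str:
    return audio.decode()        # pretend transcription

def agent(text: str) -> str:
    return f"You said: {text}"   # pretend LLM agent turn

def tts(text: str) -> bytes:
    return text.encode()         # pretend speech synthesis

def voice_turn(audio_in: bytes) -> bytes:
    # Modularity is the win: swap any stage without touching the others.
    return tts(agent(stt(audio_in)))

reply = voice_turn(b"hello")
```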
December 10, 2025 at 11:30 AM
Before you upgrade to a pricier LLM, please make sure you're optimizing your context management.

Manus uses 50 tool calls per session on average. Without context engineering, the context window fills up and performance tanks.

#1
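One common context-engineering move, as a toy: keep the most recent turns verbatim and collapse older tool outputs to short stubs so the window never fills. (An illustrative sketch of the general pattern, not Manus's actual implementation.)

```python
# Compact old tool outputs while keeping recent turns intact.
def compact(messages, keep_last=4, stub="[tool output elided]"):
    compacted = []
    for i, msg in enumerate(messages):
        is_old = i < len(messages) - keep_last
        if is_old and msg["role"] == "tool":
            compacted.append({"role": "tool", "content": stub})
        else:
            compacted.append(msg)
    return compacted

history = (
    [{"role": "tool", "content": "x" * 5000}] * 10  # 50KB of stale output
    + [{"role": "user", "content": "what next?"}]
)
trimmed = compact(history)
```

After 50 tool calls, stubbing out all but the last few results is often the difference between a usable context window and a degraded one.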
December 10, 2025 at 6:11 AM
This will make you rethink AI agent eval
December 8, 2025 at 11:34 AM
Every new model claims to be state-of-the-art. Hugging Face released a guidebook to help you think critically about those claims.

The guide covers everything from basic metrics to advanced benchmarking strategies, including what evaluation can (and critically, can't) tell you about your models.
December 8, 2025 at 8:26 AM
Microsoft just dropped VibeVoice-Realtime-0.5B - a 500M parameter TTS model that produces speech in ~300ms from first token input.

Try it here: huggingface.co/spaces/anyc...
December 8, 2025 at 6:11 AM
GPT-5.1 Codex Max is free on Cursor till December 11th - 272k context, coding-focused, sounds perfect, right?

Not quite. It's making silly mistakes, struggling with complex tasks, and sometimes even failing at simpler ones. Honestly, Composer 1 still feels much better.
December 5, 2025 at 11:30 AM
Visualizing networks shouldn't require 100 lines of code. Jaal does it in 2.

Built a Python library that turns graph data into an interactive dashboard - search nodes, filter by features, color by attributes, all in a clean web interface.

GitHub: github.com/imohitmayan...
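The "2 lines" in practice, per Jaal's README-style usage — `pip install jaal`, then pass an edge DataFrame with `from`/`to` columns. The `plot()` call launches a Dash web app, so it's commented out here:

```python
# Build a toy edge list; Jaal expects "from" and "to" columns.
import pandas as pd

edge_df = pd.DataFrame({"from": ["a", "a", "b"], "to": ["b", "c", "c"]})

# The 2 lines (opens an interactive dashboard in your browser):
# from jaal import Jaal
# Jaal(edge_df).plot()
```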
December 5, 2025 at 6:19 AM
Evolution of code LLMs
December 4, 2025 at 8:28 AM
A 300+ page deep dive into Code LLMs - from foundation models to autonomous agents.

Read: arxiv.org/pdf/2511.18538
December 4, 2025 at 6:14 AM
Frontier AI models can be used to exploit blockchain smart contracts!

Anthropic's new SCONE-bench tested models on 405 actual smart contract vulnerabilities from 2020-2025. The result? Top models cracked 19 of the 34 contracts exploited after March 2025 - contracts they'd never seen before!

#1
December 3, 2025 at 4:53 AM
Ever watched your AI agent struggle with a 10-step task? LangChain just dropped DeepAgents - a standalone library for building agents that can actually handle complex, multi-step workflows.

docs.langchain.com/oss/python/...
December 2, 2025 at 8:28 AM
100 mins of Stanford lecture on Agents, Prompts and RAG - available on YT!

The latest Stanford CS230 lecture breaks down the core techniques that differentiate robust AI products from simple prototypes.

Watch it here: www.youtube.com/watch?v=k1n...
December 2, 2025 at 6:13 AM
Instead of using a big model, use smaller models to manage bigger ones!

Nvidia's "ToolOrchestra" features an 8B orchestrator that determines when to use specialized tools and more powerful models, rather than relying on one massive model for everything.

Paper: arxiv.org/pdf/2511.21689
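The orchestration idea in toy form (not Nvidia's actual ToolOrchestra — just the routing pattern): a cheap orchestrator inspects each query and decides whether a deterministic tool, a small model, or an expensive frontier model should handle it.

```python
# Toy router: only hard queries pay frontier-model prices.
def orchestrate(query: str) -> str:
    q = query.lower()
    if any(op in q for op in ("+", "-", "*", "/")):
        return "calculator"   # deterministic tool beats any LLM on arithmetic
    if len(q.split()) < 8:
        return "small_model"  # short/simple -> cheap model
    return "large_model"      # long/complex -> escalate

route1 = orchestrate("what is 2+2?")
route2 = orchestrate("hi there")
route3 = orchestrate(
    "write a detailed analysis of transformer attention scaling laws please"
)
```

In the paper the orchestrator is itself an 8B model trained for this decision; the heuristics above just stand in for that learned policy.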
December 1, 2025 at 4:36 AM
The way you manage context can make or break your AI agent.

Here is an excellent article on the topic by Anthropic www.anthropic.com/engineering...
November 29, 2025 at 8:29 AM
How is this free? A 12-hour CUDA course taking you from beginner to expert!

If you want to understand what actually happens under the hood when you run model.fit(), this course is exactly what you need.

Watch: www.youtube.com/watch?v=86F...
November 28, 2025 at 6:28 AM