Shubhendu Trivedi
@shubhendu.bsky.social
830 followers 240 following 3.6K posts
Interests on bsky: ML research, applied math, and general mathematical and engineering miscellany. Also: Uncertainty, symmetry in ML, reliable deployment; applications in LLMs, computational chemistry/physics, and healthcare. https://shubhendu-trivedi.org
Reposted by Shubhendu Trivedi
spmontecarlo.bsky.social
This one evaded my lists; quite cool stuff!

arxiv.org/abs/2507.00272
'Iteratively Saturated Kalman Filtering'
- Alan Yang, Stephen Boyd
Reposted by Shubhendu Trivedi
spmontecarlo.bsky.social
absolutely fascinating (standard Steinerberger fare); be sure to open this!

arxiv.org/abs/2510.11571
'Robust Online Sampling from Possibly Moving Target Distributions'
- François Clément, Stefan Steinerberger
shubhendu.bsky.social
... and that this circular wheeling-dealing circus is bad, but hardly the thing to worry about re: stock markets. There is a much deeper problem that will come out in a few years. The AI stuff is a wasteful distraction, comparatively. bsky.app/profile/shub...
shubhendu.bsky.social
So far these circular announcements function as "infinite money glitch" generators. The funniest was Oracle: it went up 45% in a single day on some random capex guidance using revenue from OpenAI that doesn't exist, then flipped into a sure-shot short and retraced most of it in two weeks.
carlquintanilla.bsky.social
NVIDIA and OpenAI:

Concerns that their “increasingly complex and interconnected web of business transactions is artificially propping up the trillion-dollar AI boom.”

@bloomberg.com $NVDA 👀
www.bloomberg.com/news/feature...
shubhendu.bsky.social
I made a post earlier, but removed it because I (uncharacteristically) got self-conscious about some typo. It went something like: I have googled Taylor Swift maybe twice (and for a while thought she was an actor), but I am sure her stock went up 43% after the NEWS.
shubhendu.bsky.social
But obviously, the question often is how to scale them.
Reposted by Shubhendu Trivedi
simonwillison.net
nanochat by Andrej Karpathy is neat - 8,000 lines of code (mostly Python, a tiny bit of Rust) that can train an LLM on $100 of rented cloud compute which can then be served with a web chat UI on a much smaller machine simonwillison.net/2025/Oct/13/...
nanochat
Really interesting new project from Andrej Karpathy, described at length in this discussion post. It provides a full ChatGPT-style LLM, including training, inference and a web UI, that can be …
simonwillison.net
shubhendu.bsky.social
Yes!! I had to double-check it was the same Michelin.
shubhendu.bsky.social
Haha. I love them. This looks quite nice.
Reposted by Shubhendu Trivedi
shubhendu.bsky.social
One of my goals for the next few weeks is to start actively training for said system, using an RL/formal-languages problem as a playground. In the near-worst case there will be an ε improvement, if nothing else. In the worst case, I will learn a bit more about myself.
shubhendu.bsky.social
I want to learn this sort of tempo rather than just admire it, because it seems trainable. He seems to have developed an admirable system of micro-iteration discipline (which, ironically, I actively pushed him towards when he was an undergrad, though it has been a while since I've been able to keep up).
shubhendu.bsky.social
I have this long-term collaborator who has this ability to iterate frighteningly fast, and who, within each iteration, especially if it concerns optimization, avoids lazy knobs. E.g., optimizing kernel bandwidths? Then of course you use Brent or golden-section search rather than punting to grid search or BO.
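(To make that concrete: below is a minimal sketch, not from the thread, of what "use Brent instead of a grid" can look like for a kernel bandwidth. It assumes scipy's minimize_scalar, a Gaussian KDE, and a leave-one-out log-likelihood objective; all of those choices, and the toy data, are illustrative assumptions, not the collaborator's actual setup.)

import numpy as np
from scipy.optimize import minimize_scalar

def loo_neg_log_likelihood(log_h, x):
    # Leave-one-out negative log-likelihood of a Gaussian KDE on x.
    # Optimizing log(h) keeps the bandwidth positive.
    h = np.exp(log_h)
    d = x[:, None] - x[None, :]                        # pairwise differences
    k = np.exp(-0.5 * (d / h) ** 2) / (h * np.sqrt(2 * np.pi))
    np.fill_diagonal(k, 0.0)                           # drop each point's self-term
    dens = k.sum(axis=1) / (len(x) - 1)
    return -np.log(dens + 1e-300).sum()

rng = np.random.default_rng(0)
x = rng.normal(size=200)                               # toy data, for illustration only

# Brent's method on the scalar objective; method="golden" works too.
res = minimize_scalar(lambda lh: loo_neg_log_likelihood(lh, x),
                      bracket=(np.log(0.05), np.log(1.0)),
                      method="brent")
print("selected bandwidth:", float(np.exp(res.x)))

The point of reaching for Brent or golden-section here is that the objective is a cheap scalar function, so a superlinearly convergent 1-D method nails the minimum in a handful of evaluations, where a grid is both coarser and more expensive.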
shubhendu.bsky.social
It is always persecution complex day.
shubhendu.bsky.social
I think your point was accurate, I was just making an aside, partly because the DoE is the largest funder of physical sciences research. However, DoE funding is much more directed, e.g., it will go directly to specific labs, towards specific priorities.
shubhendu.bsky.social
I don't think the DoD bar absorbs it (and it should not, since the DoE reports separately and has a separate command). I just tried to search for some numbers, and it seems to be in the ballpark used in the chart.
shubhendu.bsky.social
As an aside: The chart seems to skip the DoE. It funds almost all of the national labs (some, like LANL, also get funded by the DoD). If you include some specific programs that it handles (e.g., National Nuclear Security Administration research, ARPA-E), it spends some $21-22 billion a year.
shubhendu.bsky.social
The thought did cross my mind, but it seems to hold across translators?
shubhendu.bsky.social
it, often juxtaposing it with him not being able to control his bowels. Reading that felt somewhat odd, but I later learnt that it was symptomatic of stomach cancer, of which he was not aware at the time. Over time you learn that this quality is an invariant throughout his writing.
shubhendu.bsky.social
There are always random asides, and instances that feel like needless oversharing, affected candor, or something similar. But there's always something to them. For example, in his tiny book 'An Armenian Sketchbook' he starts by describing a "titanic Stalin statue" in Yerevan, keeps talking about
shubhendu.bsky.social
PS: I hesitated to use "embarrassingly," because there's nothing embarrassing about that. But there's certainly this contrast throughout his writing; not sure how else to put it.
shubhendu.bsky.social
Vasily Grossman was an example of a writer who was not conventionally good (flat sentences, uneven structure, "embarrassingly" overblown metaphors), but he was nevertheless not only compulsively readable; his writing was often luminous.