Blog: https://nishtahir.com
Mastodon: social.nishtahir.com/@nish
fortune.com/2026/02/13/t...
It's ~60 hours into training and is starting to become coherent. Tokens are beginning to attend to tokens behind them as you go deeper through its layers
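As a rough sketch of what "tokens attending to tokens behind them" means mechanically, here's a toy causal self-attention weight computation in NumPy. This is an illustration of the general technique, not this model's actual code; the shapes and values are made up.

```python
import numpy as np

def causal_attention_weights(q, k):
    """Attention weights where each token can only attend to itself
    and tokens behind it (earlier positions)."""
    t = q.shape[0]
    scores = q @ k.T / np.sqrt(q.shape[-1])          # scaled dot-product scores
    future = np.triu(np.ones((t, t), dtype=bool), k=1)
    scores[future] = -np.inf                         # block attention to future tokens
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return w / w.sum(axis=-1, keepdims=True)         # softmax over allowed positions

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
w = causal_attention_weights(q, k)
# Each row sums to 1, and everything above the diagonal is 0:
# token i puts no weight on tokens ahead of it.
```

The masked positions come out exactly zero, which is why, early in training, the interesting signal is *how* each token distributes weight over the positions behind it.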
huggingface.co/spaces/nisht...
The paper examines the impact of AI on skill acquisition and formation. The authors focus on software engineering, but the lessons should apply to other domains as well.
pub.towardsai.net/hundreds-of-...
I tested out the 8b variant, and it seems to be doing stuff. 32b gave me a bunch of trouble because of context window limits.
Most of the demos I'm seeing are the same Telegram/OpenTable connections as when MCP first rolled out.
Is this just more hype, or am I missing something?
This is useful because of how LLMs represent information internally. A simplified way to build some intuition is to think in terms of a latent space.
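To make the latent-space intuition concrete, here's a toy example with made-up 3-dimensional vectors (real model embeddings have hundreds or thousands of dimensions): related concepts sit near each other in the space, and "nearness" can be measured geometrically, e.g. with cosine similarity.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings, chosen by hand for illustration only.
embeddings = {
    "cat": np.array([0.9, 0.8, 0.1]),
    "dog": np.array([0.8, 0.9, 0.2]),
    "car": np.array([0.1, 0.2, 0.9]),
}

# In this toy space, "cat" lands much closer to "dog" than to "car".
cat_dog = cosine(embeddings["cat"], embeddings["dog"])
cat_car = cosine(embeddings["cat"], embeddings["car"])
```

The point is only the geometry: operating in this space lets you compare meanings numerically instead of matching strings.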
LLMs are not compilers. With a compiler, you provide a highly detailed spec and know exactly what you are getting.
AI coverage tends to have a positive outcome self-selection bias. Social media coverage shows the positives without showing the hundreds of attempts and failed experiments.
There's a lot of opportunity for semantic content matching, not unlike Google: ask about mobile games, get ads for sponsored games. It'll be interesting to see how ads evolve on the platform.
www.cnbc.com/2026/01/16/o...
No, I don't want you to summarize my already summarized notification summaries.
I love to see people experimenting with new stuff, but the hype cycle does what the hype cycle does. Here is what I think people have been getting wrong.
From what I can tell, they are stitching together LLaMA-Factory, Ray, and vLLM into one CLI that they support for use with their products.
www.razer.com/newsroom/pro...
github.com/nishtahir/mi...
It's worth noting that Claude Code is a new project driven by a very senior engineer, with a reasonably well-defined scope that has evolved over time.