Suhaib Khan
@suhaibkhan.bsky.social
Interested in HPC & large storage systems
Reposted by Suhaib Khan
Exclusive: Amazon.com is joining Microsoft in supporting legislation that threatens to further limit Nvidia’s ability to export to China, a rare split between the chip designer and two of its biggest customers.
Amazon and Microsoft Back Effort That Would Restrict Nvidia’s Exports to China
The legislation in Washington would give tech leaders preferential access to chips at their data centers around the world.
on.wsj.com
November 14, 2025 at 7:27 AM
Reposted by Suhaib Khan
Exclusive: Samsung hikes memory chip prices by up to 60% as shortage worsens, sources say reut.rs/49QjvFI
Exclusive: Samsung hikes memory chip prices by up to 60% as shortage worsens, sources say
Samsung Electronics this month raised prices of certain memory chips - now in short supply due to the global race to build AI data centres - by as much as 60% compared to September, two people with knowledge of the hikes said.
reut.rs
November 14, 2025 at 9:00 AM
Reposted by Suhaib Khan
Very nice overview of the emerging UALink standard, with features such as packet splitting in switches, in-network computing, high energy efficiency, and low silicon overhead: buff.ly/AgLvC1g

I'll be joining a panel at SC25 contrasting UALink and UEC next Wednesday: buff.ly/BeCMFcL Join us there!
Introducing the UALink 200G 1.0 Specification Webinar
The Ultra Accelerator Link™ (UALink™) Consortium is an open industry standard group dedicated to advancing the UALink specification. The Consortium recently released the UALink 200G 1.0…
buff.ly
November 14, 2025 at 6:00 AM
DARPA’s Next-Generation Microelectronics Manufacturing (NGMM) program is building a packaging plant in Austin that is dedicated to 3D heterogeneous integration (3DHI).

spectrum.ieee.org/3d-heterogen...

@spectrum.ieee.org @darpa.mil
Why Is DARPA Betting on 3D Heterogeneous Integration?
Can a 1980s-era fab in Austin transform the future of microelectronics with 3D heterogeneous integration?
spectrum.ieee.org
November 13, 2025 at 10:14 PM
If you can read an analog clock correctly, you are still outperforming #AI in that regard.

spectrum.ieee.org/large-langua...

@spectrum.ieee.org
AI Struggles to Read Analog Clocks Correctly
AI struggles with analog clocks. What does this reveal about its limitations in image analysis?
spectrum.ieee.org
November 13, 2025 at 10:10 PM
Reposted by Suhaib Khan
Uhm, there is a typo in the headline, remove the "s" from insane. That should fix it.
November 13, 2025 at 12:06 AM
Reposted by Suhaib Khan
Paderborn's new Otus #supercomputer features 142,656 processor cores, including AMD “Turin” CPUs and #Nvidia H100 #GPUs, plus 5 PB of storage managed with the IBM Spectrum Scale (formerly GPFS) file system. ow.ly/mnlX50XqGAs
‘Otus’ Now Open for Business at Germany's PC2 - HPCwire
The Paderborn Center for Parallel Computing (PC2) in Germany this week opened its newest and largest supercomputer for business. Otus, which sports more than 142,000 processor cores, will be used to r...
ow.ly
November 12, 2025 at 9:41 PM
Andrew Ng: #AI has stark limitations, and despite rapid improvements, it will remain limited compared to humans for a long time.

#AI is amazing, but it has unfortunately been hyped up to be even more amazing than it is.

www.deeplearning.ai/the-batch/is...
Safer (and Sexier) Chatbots, Better Images Through Reasoning, The Dawn of Industrial AI, and more...
The Batch AI News and Insights: I recently received an email titled “An 18-year-old’s dilemma: Too late to contribute to AI?” Its author, who gave...
www.deeplearning.ai
November 12, 2025 at 10:09 PM
Reposted by Suhaib Khan
Europe takes a major step in research connectivity! A new terabit network will link supercomputers across the continent, including EuroHPC’s @lumi-supercomputer.eu located in CSC’s data center in Kajaani 🚀

🔗 csc.fi/en/news/tera...
November 12, 2025 at 2:46 PM
Reposted by Suhaib Khan
Are 2030 AI hyperscalers capital constrained, power constrained, DRAM constrained, flash constrained, compute constrained, software constrained, or :-) demand constrained?
November 11, 2025 at 2:30 PM
Reposted by Suhaib Khan
Racks filled with GPUs and liquid cooling gear can now weigh 6,000 pounds or more, requiring new approaches to address human safety and investment protection. Google, Meta, and Microsoft are turning to robotics to safely move these huge racks.
open.substack.com/pub/datacent...
Data Centers Turn to Robots to Haul Multi-Ton Racks
Hyperscalers, OCP Ramp Up Robotics Teams for Worker Safety, Productivity
open.substack.com
November 11, 2025 at 12:47 PM
Reposted by Suhaib Khan
Scammers have a new way of getting into your pockets: by targeting your #AI assistant. They use prompt engineering, embedding code in emails that trick AI tools into taking malicious actions. Learn how to protect your digital presence. spectrum.ieee.org/ai-agent-phi...
November 9, 2025 at 4:01 PM
Reposted by Suhaib Khan
Can we build an #AI #Climate Scientist? I asked this at the ADIA Lab Symposium in Abu Dhabi last week; the talk is now online at buff.ly/6igSeyg :-).

Much work remains to be done - this talk outlines some promising directions and indicative results, with a lot of potential to accelerate AI for Science.
November 9, 2025 at 9:24 AM
Reposted by Suhaib Khan
AI excels in complex tasks but falters at reading analog clocks—what does this tell us about its limitations?
AI Struggles to Read Analog Clocks Correctly
AI struggles with analog clocks. What does this reveal about its limitations in image analysis?
spectrum.ieee.org
November 8, 2025 at 2:01 PM
Reposted by Suhaib Khan
Nvidia's biggest scale-up domain is 72 GPUs. Google's is 9,216 TPUs.

Historically TPUs have trailed on FLOPS, memory, & bandwidth. That's no longer the case with Ironwood.

Google has a Blackwell-class TPU with absurd scale. More on @theregister.com ⬇️

www.theregister.com/2025/11/06/g...
TPU v7, Google's answer to Nvidia's Blackwell, is nearly here
Chocolate Factory's homegrown silicon boasts Blackwell-level perf at massive scale
www.theregister.com
November 7, 2025 at 4:16 PM
Reposted by Suhaib Khan
Exclusive: Intel is losing a data center AI executive who previously helped lead the company’s Gaudi accelerator chip efforts and is now headed for a job at AMD, CRN has learned. www.crn.com/news/compone...
Exclusive: Intel Is Losing A Data Center AI Executive To AMD
Intel is losing a data center AI executive who previously helped lead the company’s Gaudi accelerator chip efforts and is now headed for a job at AMD, CRN has learned.
www.crn.com
November 6, 2025 at 9:04 PM
Reposted by Suhaib Khan
Collaborator and friend Dan Alistarh talks at ETH about using the new NvFP4 and MXFP4 block formats for inference.

Some models go from "terrible" accuracy to acceptable by using micro-rotations to smooth out outliers within blocks.

arxiv.org/abs/2509.23202

Great collaboration and cool stuff.
November 5, 2025 at 8:32 AM
Reposted by Suhaib Khan
Google recently posted a promo for using their managed Lustre service to accelerate inferencing via KV caching. Raises questions:

1. Whatever happened to Google Managed DAOS (ParallelStore)? It performs better than Lustre.

2. Does Gemini use this? Unlikely. See glennklockwood.com/garden/atten...
attention
Attention is the mathematical operation within a transformer that allows different parts of the input to figure out how important they are to each other ...
glennklockwood.com
November 4, 2025 at 4:38 PM
OpenAI spreads the imaginary wealth beyond Microsoft with $38B AWS deal

Amazon deal still dwarfed by $250B Azure commitment made as part of OpenAI's for-profit transformation

www.theregister.com/2025/11/03/o...
OpenAI signs $38B cloud computing deal with AWS
Amazon deal still dwarfed by $250B Azure commitment made as part of OpenAI's for-profit transformation
www.theregister.com
November 3, 2025 at 6:56 PM
Reposted by Suhaib Khan
Silicon Valley’s biggest companies are already planning to pour $400 billion into artificial intelligence efforts this year. They all say it’s nowhere near enough.
Big Tech Is Spending More Than Ever on AI and It’s Still Not Enough
Meta, Alphabet, Microsoft and Amazon have all said they will increase spending in 2026. But investors have given mixed signals.
on.wsj.com
October 31, 2025 at 11:18 AM
Reposted by Suhaib Khan
The largest hyperscale operators say demand for AI services is filling data centers as fast as they can build them, with several saying they are compute-constrained.
As a result, they expect to build even more data center space in 2026.

datacenterrichness.substack.com/p/hyperscale...
Hyperscale Building Boom Poised to Continue
Microsoft, Google, Meta and AWS Describe Strong Demand for New Services
datacenterrichness.substack.com
October 31, 2025 at 12:35 PM
Reposted by Suhaib Khan
Each time a new AI training benchmark is introduced, the fastest training time gets longer. Hardware improvements then gradually bring the execution time back down, only to be thwarted again by the next benchmark, and the cycle repeats.
AI Model Growth Outpaces Hardware Improvements
AI training races are heating up as benchmarks get tougher.
spectrum.ieee.org
October 30, 2025 at 5:35 PM
Diamond Blankets Will Keep Future Chips Cool

Growing a micrometers-thick layer of diamond inside advanced chips spreads out the heat, lowering temperatures by more than 50 °C.

spectrum.ieee.org/diamond-ther...

@spectrum.ieee.org
Can Diamonds Solve the Chip Heat Dilemma?
Stanford's diamond innovation could redefine chip cooling, making electronics more efficient and powerful.
spectrum.ieee.org
October 30, 2025 at 5:17 PM
John Shalf @cs.lbl.gov to receive the 2025 IEEE Seymour Cray Computer Engineering Award

www.computer.org/profiles/joh...

#SC25 #HPC
John Shalf
John Shalf is the Department Head for Computer Science at Lawrence Berkeley National Laboratory. He also formerly served as the Deputy Director for Hardware Technology on the US Department of Energy (...
www.computer.org
October 29, 2025 at 5:20 PM
NVIDIA and Oracle to Build US Department of Energy’s Largest AI Supercomputer for Scientific Discovery

nvidianews.nvidia.com/news/nvidia-...
NVIDIA and Oracle to Build US Department of Energy’s Largest AI Supercomputer for Scientific Discovery
NVIDIA today announced a landmark collaboration with Oracle to build the U.S. Department of Energy (DOE)’s largest AI supercomputer to dramatically accelerate scientific discovery.
nvidianews.nvidia.com
October 29, 2025 at 4:15 PM