Nawaf Alageel
nawafalageel.bsky.social
Trying to teach computers how to see through math.
Now, instead of guessing or jumping through hoops to find the answer, the tool can tell us:
"Container X occupies 12 GB on GPU #1 with Y memory utilization"

We went from managing resources blindfolded to actual insight.

And our question is finally answered 🥳🎉
July 15, 2025 at 11:39 AM
- Nvidia tools (e.g., nvidia-smi) show processes, but not container names.
- Docker tools (e.g., docker stats) show CPU and memory, but no GPU data.

We would still be blindfolded. And our question is not answered yet!
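
One way to bridge the two views by hand is through the PID: nvidia-smi reports a PID per GPU process, and on a Linux host the cgroup file of that PID usually contains the Docker container ID. The sketch below is an illustration under assumptions, not necessarily how the tool in this thread works. It assumes nvidia-smi runs on the host (so PIDs are host PIDs) and that the container ID appears as a 64-character hex string in /proc/<pid>/cgroup. The function names are hypothetical.

```python
import re

# Assumption: Docker embeds the full 64-hex-character container ID
# somewhere in the process's cgroup path (cgroup v1 or v2).
CONTAINER_ID = re.compile(r"[0-9a-f]{64}")


def parse_compute_apps(csv_text):
    """Parse the output of:
    nvidia-smi --query-compute-apps=pid,used_memory,gpu_uuid \
               --format=csv,noheader,nounits
    """
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        pid, used, gpu_uuid = [field.strip() for field in line.split(",")]
        rows.append({
            "pid": int(pid),
            # Memory can be reported as "[N/A]"; keep None in that case.
            "used_mib": int(used) if used.isdigit() else None,
            "gpu_uuid": gpu_uuid,
        })
    return rows


def container_id_from_cgroup(cgroup_text):
    """Return the first 64-hex container ID found, or None."""
    match = CONTAINER_ID.search(cgroup_text)
    return match.group(0) if match else None


def read_cgroup_from_proc(pid):
    """Read /proc/<pid>/cgroup; return "" if the process is gone."""
    try:
        with open(f"/proc/{pid}/cgroup") as f:
            return f.read()
    except OSError:
        return ""


def join_gpu_to_containers(apps, read_cgroup=read_cgroup_from_proc):
    """Attach a container ID (or None) to every GPU process row."""
    return [
        {**app, "container_id": container_id_from_cgroup(read_cgroup(app["pid"]))}
        for app in apps
    ]
```

The container ID can then be matched to a name with `docker ps --no-trunc --format '{{.ID}} {{.Names}}'`. Passing `read_cgroup` as a parameter keeps the join testable without a GPU or a running container.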
July 15, 2025 at 11:39 AM
When it comes to monitoring GPU usage in a containerized environment, Nvidia and Docker both provide good out-of-the-box tools, but the tools don't work together.

Neither of them can answer my simple question:
"Which container uses which GPU?"
July 15, 2025 at 11:39 AM
If you're training ML models with Docker containers and Nvidia GPUs, especially on-prem, you've likely seen wasted compute.

GPUs sit idle while their memory (a.k.a. VRAM) stays occupied, and without observability tools, that leads to poor utilization and wasted compute.

#Nvidia #GPU #Docker #MachineLearning
July 15, 2025 at 11:39 AM