James Womack
jcwomack.bsky.social
{{ .Values.interestingTagline }}
I’ll be honest, though, it was the Peep Show reference in the article image that got me reading!
February 2, 2025 at 1:59 PM
3. Use tiered asynchronous checkpointing (DRAM -> node-local storage -> shared storage) to avoid blocking GPUs
February 2, 2025 at 1:58 PM
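A rough sketch of the tiered checkpointing idea in point 3, using only the Python standard library. The class name `TieredCheckpointer`, the directory arguments and the pickle file format are all made up for illustration and are not from the article. In a real training job the "DRAM" step would be a device-to-host copy of GPU tensors rather than a `deepcopy`.

```python
import copy
import pickle
import shutil
import threading
from pathlib import Path


class TieredCheckpointer:
    """Hypothetical helper: snapshot to DRAM, then drain to disk tiers in the background."""

    def __init__(self, local_dir, shared_dir):
        self.local_dir = Path(local_dir)    # stands in for a node-local SSD
        self.shared_dir = Path(shared_dir)  # stands in for shared storage
        self.local_dir.mkdir(parents=True, exist_ok=True)
        self.shared_dir.mkdir(parents=True, exist_ok=True)
        self._worker = None

    def save(self, step, state):
        # Tier 1 (DRAM): take an in-memory copy so training can keep mutating `state`.
        snapshot = copy.deepcopy(state)
        # Allow only one checkpoint in flight; this blocks if the previous one is still draining.
        self.wait()
        self._worker = threading.Thread(target=self._drain, args=(step, snapshot))
        self._worker.start()

    def _drain(self, step, snapshot):
        name = f"step_{step:08d}.pkl"
        local_path = self.local_dir / name
        # Tier 2 (node-local storage): write the snapshot to the local directory.
        with open(local_path, "wb") as f:
            pickle.dump(snapshot, f)
        # Tier 3 (shared storage): copy the finished local file to the shared directory.
        shutil.copy2(local_path, self.shared_dir / name)

    def wait(self):
        # Block until any in-flight checkpoint has reached shared storage.
        if self._worker is not None:
            self._worker.join()
            self._worker = None
```

The training loop would call `save(step, state)` and carry on straight away, calling `wait()` only before exiting. The blocking part is the in-memory copy, plus a wait if the previous checkpoint has not finished draining.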
2. A fast parallel distributed file system is not needed for training and adds unnecessary complexity to the system
February 2, 2025 at 1:58 PM
Interesting points I picked up

1. To keep GPUs busy in LLM training, use node-local SSDs to store training
February 2, 2025 at 1:58 PM
I used this to do an editable pip install of a src-layout Python package I was working on in the dev container, with all the development dependencies. Now when the dev container is launched, the Python package is already installed and ready to develop/test/debug! Very handy.
December 20, 2024 at 9:04 PM
A really nice feature is that dev containers make it very easy to do post-build customisation of a base image/Containerfile involving the source code being developed. You specify commands to run in the built container with your source code workspace (from the host) mounted into the container.
December 20, 2024 at 9:04 PM
Quite impressed. After reading the VSCode dev containers docs (code.visualstudio.com/docs/devcont...), it didn’t take long for me to get a containerised dev environment set up…
Developing inside a Container using Visual Studio Code Remote Development
code.visualstudio.com
December 20, 2024 at 9:04 PM
An unordered list of interesting things I learned:

* How to write a ReFrame test
* What a roofline model is
* There are many novel/exotic hardware testbeds dotted around the UK (thanks to the ExCALIBUR H&ES programme)
* People really like to see and hold the GH200 and Grace CPU superchips
December 7, 2024 at 9:42 PM
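For anyone wondering about the roofline model mentioned above, the basic version fits in one line: attainable performance is the smaller of the compute peak and memory bandwidth multiplied by arithmetic intensity (FLOPs per byte moved). A minimal sketch follows. The function name and the hardware numbers are invented for illustration and do not describe any real machine.

```python
def attainable_gflops(peak_gflops, peak_bandwidth_gbs, arithmetic_intensity):
    """Basic roofline bound in GFLOP/s.

    arithmetic_intensity is in FLOP/byte, so bandwidth (GB/s) times
    intensity gives the memory-bound ceiling in GFLOP/s.
    """
    return min(peak_gflops, peak_bandwidth_gbs * arithmetic_intensity)


# Hypothetical machine: 1000 GFLOP/s compute peak, 100 GB/s memory bandwidth.
PEAK_GFLOPS = 1000.0
PEAK_BANDWIDTH_GBS = 100.0

# Below the ridge point (peak / bandwidth = 10 FLOP/byte) a kernel is memory-bound.
memory_bound = attainable_gflops(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS, 2.0)    # 200.0
# Above the ridge point the compute peak is the limit.
compute_bound = attainable_gflops(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS, 50.0)  # 1000.0
```

Plotting this bound against arithmetic intensity on log-log axes gives the familiar sloped line that flattens into a roof.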