PSGD: MSE( Q.T Q H , I ) = 5.2e-3
Zero-Power NS 100 iterations: MSE( NS(G) , I ) = 8.2e-1
True Inverse: MSE( H^(-1/2) H H^(-1/2), I ) = 6.1e-3
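PSGD whitens information significantly better than the Newton-Schulz iterations found in Muon: its MSE to the identity (5.2e-3) is on par with the true inverse (6.1e-3), while 100 NS iterations still leave 8.2e-1.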
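For anyone who wants to reproduce the "True Inverse" style of measurement, here is a minimal NumPy sketch. It is not the PSGD or Muon code: the matrix H is a random SPD stand-in, the helper names are hypothetical, and Q is computed directly from a Cholesky factor of H^(-1) rather than learned, purely to show what MSE( Q.T Q H , I ) is measuring.

```python
import numpy as np

def mse_to_identity(M):
    """Mean squared error between a square matrix M and the identity."""
    n = M.shape[0]
    return float(np.mean((M - np.eye(n)) ** 2))

def inverse_sqrt_spd(H):
    """H^(-1/2) for a symmetric positive definite H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    # Scale column j of V by w_j^(-1/2), then rotate back: V diag(w^-1/2) V^T
    return (V * w ** -0.5) @ V.T

# Assumption: a small, well-conditioned random SPD matrix stands in for H.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
H = A @ A.T + 8 * np.eye(8)

# "True inverse" whitening: H^(-1/2) H H^(-1/2) should be the identity.
H_inv_sqrt = inverse_sqrt_spd(H)
mse_true = mse_to_identity(H_inv_sqrt @ H @ H_inv_sqrt)

# A Q with Q.T Q = H^(-1), built from a Cholesky factor (NOT learned by PSGD).
L = np.linalg.cholesky(np.linalg.inv(H))
Q = L.T
mse_q = mse_to_identity(Q.T @ Q @ H)

# Baseline: how far the unwhitened H itself is from the identity.
mse_raw = mse_to_identity(H)
```

In exact arithmetic both whitened errors are zero, so any nonzero value reported for a learned or iterative method reflects how well it approximates the inverse, not the metric itself.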
And two rejects with a score of 1 and a confidence of 4. And one borderline reject with a confidence of 4.
It's a great, small, and fully open VLM that I'm really excited about for fine-tuning and on-device use cases 💻
It also comes with 0-day MLX support via mlx-vlm, here it is running at > 80 tok/s on my M1 Max 🤯
go.bsky.app/2qnppia
MARS is a new exciting variance reduction technique from @quanquangu.bsky.social 's group which can help stabilize and accelerate your deep learning pipeline. All that is needed is a gradient buffer. Here MARS speeds up the convergence of PSGD ultimately leading to a better solution.
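To make the "gradient buffer" remark concrete, here is a rough sketch of the kind of correction such a buffer enables, based on my reading of the MARS paper. The function name, the default values of `beta1` and `gamma`, and the exact scaling and clipping are assumptions to verify against the paper and the official code; this is not the implementation used in the PSGD experiment.

```python
import numpy as np

def mars_corrected_gradient(grad, prev_grad, beta1=0.95, gamma=0.025, max_norm=1.0):
    """Hypothetical helper: variance-reduced gradient from a one-step buffer.

    grad      : current stochastic gradient
    prev_grad : gradient kept in the buffer from the previous step
    beta1, gamma, max_norm : illustrative values, not tuned recommendations
    """
    # Add a scaled difference between the current and buffered gradients.
    c = grad + gamma * (beta1 / (1.0 - beta1)) * (grad - prev_grad)
    # Clip the corrected gradient to at most max_norm (assumed safeguard).
    norm = np.linalg.norm(c)
    if norm > max_norm:
        c = c * (max_norm / norm)
    return c

# Usage: the corrected gradient is then fed to the base optimizer
# (PSGD in the post) in place of the raw gradient.
g_prev = np.array([0.30, 0.40])
g_now = np.array([0.30, 0.40])
c_same = mars_corrected_gradient(g_now, g_prev)  # no change between steps
c_big = mars_corrected_gradient(np.array([3.0, 4.0]), np.array([3.0, 4.0]))
```

The only extra state is `prev_grad`, which is the memory cost the post refers to.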
But RT isn't just for CTs. It's a sort of generalization of marginals in probability
RT g(p,θ): Shoot rays at angle θ+90° with offset p, and measure the line integral of f(x,y) along each ray
1/n
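The "generalization of marginals" point can be checked numerically. Below is a minimal sketch, assuming a square grid and a crude pixel-binning approximation of the line integrals (not a production Radon transform; `radon_projection` is a hypothetical helper). Each pixel's mass is dropped into the bin for its offset p = x cos θ + y sin θ, so at θ = 0° and θ = 90° the projection reduces to the two ordinary marginals of f.

```python
import numpy as np

def radon_projection(f, theta_deg):
    """Approximate g(p, theta) for a square array f by binning pixel mass by offset p."""
    n = f.shape[0]
    assert f.shape == (n, n), "this sketch assumes a square grid"
    # Pixel-centre coordinates with the origin at the centre of the grid.
    ys, xs = np.mgrid[0:n, 0:n]
    xs = xs - (n - 1) / 2.0
    ys = ys - (n - 1) / 2.0
    theta = np.deg2rad(theta_deg)
    # Signed offset of each pixel from the origin along the direction (cos, sin).
    p = xs * np.cos(theta) + ys * np.sin(theta)
    # Unit-width bins, padded so the diagonal fits for every angle.
    pad = int(np.ceil((np.sqrt(2.0) * n - n) / 2.0))
    half = n / 2.0 + pad
    g, _ = np.histogram(p.ravel(), bins=n + 2 * pad,
                        range=(-half, half), weights=f.ravel())
    return g

# Assumption: a random normalised 2D array stands in for a joint density f(x, y).
rng = np.random.default_rng(0)
f = rng.random((6, 6))
f /= f.sum()

g0 = radon_projection(f, 0.0)    # integrates out y: marginal over x
g90 = radon_projection(f, 90.0)  # integrates out x: marginal over y
g37 = radon_projection(f, 37.0)  # a "marginal" along an oblique direction
pad = (len(g0) - f.shape[0]) // 2
```

Every angle preserves the total mass, which is what makes each projection behave like a marginal distribution taken along a rotated axis.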
github.com/ethansmith20...
dev-discuss.pytorch.org/t/fsdp-cudac...