cudastoic.bsky.social
Reposted
OpenAI's GPT-OSS is still insanely underrated given how widely adopted it is as an open LLM. Downloads are out of control.
January 12, 2026 at 1:40 AM
Reposted
One of my favorite findings: Positional embeddings are just training wheels. They help convergence but hurt long-context generalization.

We found that if you simply delete them after pretraining and recalibrate for <1% of the original budget, you unlock massive context windows. Smarter, not harder.
Introducing DroPE: Extending Context by Dropping Positional Embeddings

We found embeddings like RoPE aid training but bottleneck long-sequence generalization. Our solution’s simple: treat them as a temporary training scaffold, not a permanent necessity.

arxiv.org/abs/2512.12167
pub.sakana.ai/DroPE
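
A minimal PyTorch sketch of the idea as described above (my own illustration, not Sakana's code; names like use_rope are made up): attention where RoPE is applied during pretraining as a scaffold and simply switched off afterwards, leaving a position-free model whose only ordering signal is the causal mask.

import torch
import torch.nn.functional as F

def rope(x, base=10000.0):
    # Standard rotary embedding over the last (head) dimension.
    t, d = x.shape[-2], x.shape[-1]
    half = d // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    ang = torch.outer(torch.arange(t, dtype=torch.float32), freqs)  # (t, half)
    cos, sin = ang.cos(), ang.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

def attention(q, k, v, use_rope=True):
    # use_rope=True  -> pretraining behavior (RoPE as training scaffold).
    # use_rope=False -> DroPE-style inference: no explicit positions at all;
    #                   ordering comes only from the causal mask.
    if use_rope:
        q, k = rope(q), rope(k)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)

Per the post, after pretraining you would flip use_rope off and fine-tune briefly (under 1% of the original budget) so the attention layers recalibrate to the absence of explicit positions.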
January 12, 2026 at 4:12 AM
Reposted
ZLUDA, now in its third iteration, has added CUDA 13.1 compatibility on non-NVIDIA GPUs (well… AMD GPUs).

- 1st iteration: Intel created ZLUDA as a drop-in replacement for CUDA on non-NVIDIA GPUs.
- 2nd iteration: AMD took over development after Intel dropped support.
January 12, 2026 at 2:56 AM
Reposted
Oh wow, DeepSeek is starting to make serious progress on LLMs that offload memory to external storage: github.com/deepseek-ai/...
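
The repo link above is truncated, so here is only a generic, hypothetical sketch of the concept (not DeepSeek's implementation; all names are made up): a KV cache that keeps recent blocks in RAM and spills older ones to memory-mapped files on disk, the "external storage."

import os
import tempfile
import numpy as np

class OffloadedKVCache:
    # Keeps the most recent hot_blocks KV blocks in RAM and spills the
    # rest to .npy files on disk, reloaded lazily via memory mapping.

    def __init__(self, hot_blocks=4, root=None):
        self.hot = {}                      # block_id -> np.ndarray in RAM
        self.hot_blocks = hot_blocks
        self.root = root or tempfile.mkdtemp(prefix="kv_offload_")

    def _path(self, block_id):
        return os.path.join(self.root, f"block_{block_id}.npy")

    def put(self, block_id, kv):
        self.hot[block_id] = kv
        if len(self.hot) > self.hot_blocks:
            # Evict the oldest block from RAM to disk.
            oldest = min(self.hot)
            np.save(self._path(oldest), self.hot.pop(oldest))

    def get(self, block_id):
        if block_id in self.hot:
            return self.hot[block_id]
        # Memory-map so only the pages attention actually touches are read.
        return np.load(self._path(block_id), mmap_mode="r")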
January 12, 2026 at 6:44 PM