cudastoic.bsky.social
Reposted
OpenAI's GPT-OSS is still insanely underrated given how widely adopted it is as an open LLM. Downloads are out of control.
January 12, 2026 at 1:40 AM
Reposted
One of my favorite findings: Positional embeddings are just training wheels. They help convergence but hurt long-context generalization.

We found that if you simply delete them after pretraining and recalibrate for <1% of the original budget, you unlock massive context windows. Smarter, not harder.
Introducing DroPE: Extending Context by Dropping Positional Embeddings

We found embeddings like RoPE aid training but bottleneck long-sequence generalization. Our solution’s simple: treat them as a temporary training scaffold, not a permanent necessity.

arxiv.org/abs/2512.12167
pub.sakana.ai/DroPE
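
A minimal PyTorch sketch of the idea as described above (my own illustration, not Sakana's code; names like use_rope are made up): attention where RoPE is applied during pretraining as a scaffold and simply switched off afterwards, leaving a position-free model whose only ordering signal is the causal mask.

import torch
import torch.nn.functional as F

def rope(x, base=10000.0):
    # Standard rotary embedding over the last (head) dimension.
    t, d = x.shape[-2], x.shape[-1]
    half = d // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    ang = torch.outer(torch.arange(t, dtype=torch.float32), freqs)  # (t, half)
    cos, sin = ang.cos(), ang.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

def attention(q, k, v, use_rope=True):
    # use_rope=True  -> pretraining behavior (RoPE as training scaffold).
    # use_rope=False -> DroPE-style inference: no explicit positions at all;
    #                   ordering comes only from the causal mask.
    if use_rope:
        q, k = rope(q), rope(k)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)

Per the post, after pretraining you would flip use_rope off and fine-tune briefly (under 1% of the original budget) so the attention layers recalibrate to the absence of explicit positions.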
January 12, 2026 at 4:12 AM
Reposted
ZLUDA, now in its third iteration, has added CUDA 13.1 compatibility on non-NVIDIA GPUs (well… AMD GPUs).

- 1st iteration: Intel created ZLUDA as a drop-in replacement for CUDA on non-NVIDIA GPUs.
- 2nd iteration: AMD took over development after Intel dropped support.
January 12, 2026 at 2:56 AM
Reposted
Oh wow, DeepSeek is starting to make serious progress on LLMs that offload memory to external storage: github.com/deepseek-ai/...
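
The repo link above is truncated, so here is only a generic, hypothetical sketch of the concept (not DeepSeek's implementation; all names are made up): a KV cache that keeps recent blocks in RAM and spills older ones to memory-mapped files on disk, the "external storage."

import os
import tempfile
import numpy as np

class OffloadedKVCache:
    # Keeps the most recent hot_blocks KV blocks in RAM and spills the
    # rest to .npy files on disk, reloaded lazily via memory mapping.

    def __init__(self, hot_blocks=4, root=None):
        self.hot = {}                      # block_id -> np.ndarray in RAM
        self.hot_blocks = hot_blocks
        self.root = root or tempfile.mkdtemp(prefix="kv_offload_")

    def _path(self, block_id):
        return os.path.join(self.root, f"block_{block_id}.npy")

    def put(self, block_id, kv):
        self.hot[block_id] = kv
        if len(self.hot) > self.hot_blocks:
            # Evict the oldest block from RAM to disk.
            oldest = min(self.hot)
            np.save(self._path(oldest), self.hot.pop(oldest))

    def get(self, block_id):
        if block_id in self.hot:
            return self.hot[block_id]
        # Memory-map so only the pages attention actually touches are read.
        return np.load(self._path(block_id), mmap_mode="r")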
January 12, 2026 at 6:44 PM