Lightnews — Scholar-powered news

Sung Kim @sungkim.bsky.social · 4h

I guess Intel’s strategy is to take a consumer GPU and just slap on a big a** memory. I kinda like it. 👍

www.tomshardware.com/pc-component...

Intel unveils Crescent Island, an inference-only GPU with Xe3P architecture and 160GB of memory

And a mysterious configuration.

www.tomshardware.com


Sung Kim @sungkim.bsky.social · 4h

Nanonets-OCR2 by Nanonets is a family of powerful, state-of-the-art image-to-markdown OCR models that go far beyond traditional text extraction. It transforms documents into structured markdown with intelligent content recognition and semantic tagging.

huggingface.co/nanonets/Nan...

nanonets/Nanonets-OCR2-3B · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co


Sung Kim @sungkim.bsky.social · 5h

paper: arxiv.org/abs/2510.11690
blog: rae-dit.github.io
code: github.com/bytetriper/RAE


Sung Kim @sungkim.bsky.social · 5h

Their approach achieves faster convergence without auxiliary representation alignment losses. Using a DiT variant equipped with a lightweight, wide DDT head, they achieve strong image generation results on ImageNet: 1.51 FID at 256x256 (no guidance) and 1.13 FID at both 256x256 and 512x512 (with guidance).


Sung Kim @sungkim.bsky.social · 5h

They replace the Variational Autoencoder (VAE) with pretrained representation encoders (e.g., DINO, SigLIP, MAE) paired with trained decoders, which they term Representation Autoencoders (RAE).


Sung Kim @sungkim.bsky.social · 5h

LMSYS.ORG's NVIDIA DGX Spark In-Depth Review: A New Standard for Local AI Inference

TLDR. It’s built to bring AI experimentation to your desk, albeit very slowly.

lmsys.org/blog/2025-10...

NVIDIA DGX Spark In-Depth Review: A New Standard for Local AI Inference | LMSYS Org

Thanks to NVIDIA’s early access program, we are thrilled to get our hands on the NVIDIA DGX™ Spark. It’s quite an unconventional system, as NVIDIA rarely ...

lmsys.org


Sung Kim @sungkim.bsky.social · 6h

So, China stopped buying soybeans from U.S. farmers because of Trump’s tariff policy and started buying them from Argentina instead.

Meanwhile, the Trump administration is supporting Argentina’s economy with $20 billion of U.S. taxpayers’ money - helping Argentine farmers export soybeans to China.


Sung Kim @sungkim.bsky.social · 8h

A new VR (or XR) headset is coming from Samsung. It will be announced on October 21, 2025 at 10:00 PM ET.

news.samsung.com/us/samsung-g...


Sung Kim @sungkim.bsky.social · 14h

Ignore all those people telling you to switch to Linux or macOS. Remember, you don’t like change - and that’s perfectly fine.

Sung Kim @sungkim.bsky.social · 14h

To everyone still using Windows 10 - keep using it for as long as you like. There are still plenty of people out there running Windows 7, 8, and even older versions.

You’re sticking with Windows 10 because you don’t like change, and honestly, no one’s going to convince you otherwise.


Sung Kim @sungkim.bsky.social · 1d

Blog: yuxi.ml/essays/posts...
Paper: The Serial Scaling Hypothesis (arxiv.org/abs/2507.12549)

Perfect diffusion is TC0 – Bad diffusion is Turing-complete – Yuxi on the Wired

An application of computational complexity theory to diffusion language models. Unlikely to be useful for anything, but it is cute!

yuxi.ml


Sung Kim @sungkim.bsky.social · 1d

Diffusion models are not truly serial models

Diffusion models:
- Look serial methodologically (step by step).
- But perform less like a truly serial model (autoregression).

They find that a diffusion model solves every problem at the same convergence rate. It will never be a serial model.


Sung Kim @sungkim.bsky.social · 1d

Pretraining with Hierarchical Memories

They propose dividing LLM parameters into 1) anchor (always used, capturing commonsense) and 2) memory bank (selected per query, capturing world knowledge).

Paper: arxiv.org/abs/2510.02375


Sung Kim @sungkim.bsky.social · 1d

Paper: arxiv.org/abs/2510.07242


Sung Kim @sungkim.bsky.social · 1d

Meta released a paper on Hybrid RL

It offers a promising way to go beyond purely verifiable rewards - combining the reliability of verifier signals with the richness of learned feedback. The results are: +11.7 pts vs RM-only and +9.2 pts vs verifier-only on hard-to-verify reasoning tasks.
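
To make the idea of mixing the two signals concrete, here is a minimal sketch of one way a binary verifier verdict and a learned reward-model score could be combined. This is an illustration only and not the paper's formulation: the function name `hybrid_reward`, the `band` parameter, and the assumption that the reward-model score lies in [0, 1] are all hypothetical.

```python
def hybrid_reward(verified: bool, rm_score: float, band: float = 0.2) -> float:
    """Combine a binary verifier verdict with a reward-model score.

    Hypothetical scheme: the verifier sets the base reward (1.0 if the
    answer is verified, else 0.0) and the reward model shifts it by at
    most +/- band / 2. With band < 1, every verified answer still
    outranks every unverified one, while answers inside the same group
    are ordered by the learned score.
    """
    if not 0.0 <= rm_score <= 1.0:
        raise ValueError("rm_score must be in [0, 1]")
    if not 0.0 <= band < 1.0:
        raise ValueError("band must be in [0, 1)")
    base = 1.0 if verified else 0.0
    return base + band * (rm_score - 0.5)


# A verified answer with a poor learned score still beats
# an unverified answer with a perfect learned score.
print(hybrid_reward(True, 0.0) > hybrid_reward(False, 1.0))
```

The design choice here is that the verifier stays the dominant signal and the learned score only adds finer-grained ordering within each group.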


Sung Kim @sungkim.bsky.social · 1d

Vuk Rosić trained 13 LLMs ranging from 0% to 100% attention layers (the rest being DeltaNet linear attention). He found that 17% attention (2 attention layers out of 12) worked best.

github.com/Open-Superin...
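
The numbers in the post line up if the sweep varies the count of attention layers in a 12-layer model from 0 to 12. That reading is an assumption, not something the post states outright; the names `TOTAL_LAYERS` and `attention_share` are made up for this sketch.

```python
# Assumed setup: a 12-layer model where k layers use full attention and
# the remaining 12 - k layers use linear attention (DeltaNet in the post).
TOTAL_LAYERS = 12


def attention_share(attn_layers: int, total_layers: int = TOTAL_LAYERS) -> float:
    """Fraction of layers that use full attention."""
    if not 0 <= attn_layers <= total_layers:
        raise ValueError("attn_layers must be between 0 and total_layers")
    return attn_layers / total_layers


# k = 0..12 gives 13 configurations, covering 0% to 100% attention.
sweep = {k: attention_share(k) for k in range(TOTAL_LAYERS + 1)}

for k, share in sweep.items():
    print(f"{k:2d} attention layers -> {share:.0%}")
```

Under this assumption, 2 of 12 layers is 16.7%, which rounds to the 17% quoted above.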


Sung Kim @sungkim.bsky.social · 1d

I hope this helps.


Sung Kim @sungkim.bsky.social · 1d

"Install the Beads binary, tell your agent in AGENTS.md to stop using Markdown and run `bd quickstart`, and your agents will spontaneously get better at everything, particularly long-horizon planning and keeping track of newly discovered work."

github.com/steveyegge/b...


Sung Kim @sungkim.bsky.social · 1d

@steve-yegge.bsky.social released Beads - A memory upgrade for your coding agent

"It is a magical 4-dimensional graph-based git-backed fairy-dusted issue-tracker database, designed to let coding agents track all your work and never get lost again."


Sung Kim @sungkim.bsky.social · 1d

ByteDance released FaceCLIP

A new vision-language model specializing in understanding and generating diverse human faces.

huggingface.co/ByteDance/Fa...


Sung Kim @sungkim.bsky.social · 1d

Blog: github.com/karpathy/nan...
Repo: github.com/karpathy/nan...


Sung Kim @sungkim.bsky.social · 1d

@karpathy.bsky.social released nanochat

"A minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM."


Sung Kim @sungkim.bsky.social · 1d

→ 1 T total / 50 B active params · 128 K context window
→ Reinforced by Icepop RL + ASystem (Trillion-Scale RL Engine)
→ Open-source SOTA in natural language reasoning — AIME 25 / HMMT 25 / ARC-AGI-1 / Codeforces

huggingface.co/inclusionAI/...

Sung Kim @sungkim.bsky.social · 1d

Alibaba Ant Group's Ring-1T

Alibaba Ant Group previously released Ling-1T, a non-thinking model. Now it has released Ring-1T, a thinking model that achieves silver-level IMO reasoning through pure natural language reasoning.


Sung Kim @sungkim.bsky.social · 1d

and (iii) higher inference efficiency, with a MIMO formulation that raises arithmetic intensity.

openreview.net/pdf?id=HwCva...

openreview.net
