Sung Kim
@sungkim.bsky.social
A business analyst at heart who enjoys delving into AI, ML, data engineering, data science, data analytics, and modeling. My views are my own. You can also find me at threads: @sung.kim.mw
sungkim.bsky.social
Nanonets-OCR2 by Nanonets is a family of powerful, state-of-the-art image-to-markdown OCR models that go far beyond traditional text extraction. It transforms documents into structured markdown with intelligent content recognition and semantic tagging.

huggingface.co/nanonets/Nan...
nanonets/Nanonets-OCR2-3B · Hugging Face
sungkim.bsky.social
Their approach achieves faster convergence without auxiliary representation alignment losses. Using a DiT variant equipped with a lightweight, wide DDT head, they achieve strong image generation results on ImageNet: 1.51 FID at 256x256 (no guidance) and 1.13 at both 256x256 and 512x512 (with guidance).
sungkim.bsky.social
Replace the Variational Autoencoder (VAE) with pretrained representation encoders (e.g., DINO, SigLIP, MAE) paired with trained decoders, which they term Representation Autoencoders (RAE).
sungkim.bsky.social
So, China stopped buying soybeans from U.S. farmers because of Trump’s tariff policy and started buying them from Argentina instead.

Meanwhile, Trump’s administration is supporting Argentina’s economy with $20 billion of U.S. taxpayers’ money, helping Argentine farmers export soybeans to China.
sungkim.bsky.social
A new VR (or XR) device is coming from Samsung. It will be announced on October 21, 2025 at 10:00 PM ET.

news.samsung.com/us/samsung-g...
sungkim.bsky.social
Ignore all those people telling you to switch to Linux or macOS. Remember, you don’t like change - and that’s perfectly fine.
sungkim.bsky.social
To everyone still using Windows 10 - keep using it for as long as you like. There are still plenty of people out there running Windows 7, 8, and even older versions.

You’re sticking with Windows 10 because you don’t like change, and honestly, no one’s going to convince you otherwise.
sungkim.bsky.social
Diffusion models are not truly serial models

Diffusion models:
- Look serial in method (step by step).
- But behave less like a truly serial model (autoregression).

They find that a diffusion model solves each problem at the same convergence rate, so it will never be a truly serial model.
sungkim.bsky.social
Pretraining with Hierarchical Memories

They propose dividing LLM parameters into 1) anchor (always used, capturing commonsense) and 2) memory bank (selected per query, capturing world knowledge).

Paper: arxiv.org/abs/2510.02375
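A toy sketch of the anchor / memory bank split described above. It is not the paper's implementation: the cosine-similarity lookup, the dictionary layout, and every name here (`select_memories`, `active_parameters`, the `geography` and `chemistry` blocks) are hypothetical and chosen only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def select_memories(query, bank, k):
    """Return the names of the k memory blocks whose keys best match the query.
    Similarity-based selection is an assumption of this sketch."""
    ranked = sorted(bank, key=lambda name: cosine(query, bank[name]["key"]), reverse=True)
    return ranked[:k]

def active_parameters(anchor, bank, query, k=1):
    """Anchor parameters are always used; memory parameters are added per query."""
    chosen = select_memories(query, bank, k)
    params = dict(anchor)
    for name in chosen:
        params.update(bank[name]["params"])
    return chosen, params

# Hypothetical data: one always-on anchor weight and two memory blocks.
anchor = {"anchor.w": 0.5}
bank = {
    "geography": {"key": [1.0, 0.0], "params": {"geo.w": 0.1}},
    "chemistry": {"key": [0.0, 1.0], "params": {"chem.w": 0.2}},
}

chosen, params = active_parameters(anchor, bank, query=[0.9, 0.1], k=1)
print(chosen)          # memory blocks selected for this query
print(sorted(params))  # anchor parameters plus the selected memory parameters
```

Only the selected block's parameters join the anchor for a given query, which is the property the post highlights.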
sungkim.bsky.social
Meta released a paper on Hybrid RL

It offers a promising way to go beyond purely verifiable rewards - combining the reliability of verifier signals with the richness of learned feedback. The results are: +11.7 pts vs RM-only and +9.2 pts vs verifier-only on hard-to-verify reasoning tasks.
sungkim.bsky.social
Vuk Rosić trained 13 LLMs with full-attention shares ranging from 0% to 100% (the remaining layers using DeltaNet linear attention). He found that 17% attention (2 attention layers out of 12) performed best.

github.com/Open-Superin...
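The 17% figure is just 2 of 12 layers (about 16.7%). A minimal sketch of such a layer plan follows. The even spacing of the attention layers and the name `hybrid_layer_plan` are assumptions; the post does not say where the attention layers sit.

```python
def hybrid_layer_plan(num_layers: int, num_attention: int) -> list[str]:
    """Label each layer as full attention or DeltaNet linear attention.
    Attention layers are spread evenly; that placement is an assumption."""
    if not 0 <= num_attention <= num_layers:
        raise ValueError("num_attention must be between 0 and num_layers")
    plan = ["deltanet"] * num_layers
    if num_attention:
        stride = num_layers / num_attention
        for i in range(num_attention):
            # Put each attention layer in the middle of its stride-sized slot.
            plan[int(i * stride + stride / 2)] = "attention"
    return plan

plan = hybrid_layer_plan(num_layers=12, num_attention=2)
share = plan.count("attention") / len(plan)
print(plan)
print(f"attention share: {share:.1%}")
```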
sungkim.bsky.social
"Install the Beads binary, tell your agent in AGENTS.md to stop using Markdown and run `bd quickstart`, and your agents will spontaneously get better at everything, particularly long-horizon planning and keeping track of newly discovered work."

github.com/steveyegge/b...
sungkim.bsky.social
@steve-yegge.bsky.social released Beads - A memory upgrade for your coding agent

"It is a magical 4-dimensional graph-based git-backed fairy-dusted issue-tracker database, designed to let coding agents track all your work and never get lost again."
sungkim.bsky.social
ByteDance released FaceCLIP

A new vision-language model specializing in understanding and generating diverse human faces.

huggingface.co/ByteDance/Fa...
sungkim.bsky.social
@karpathy.bsky.social released nanochat

"A minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM."
sungkim.bsky.social
→ 1 T total / 50 B active params · 128 K context window
→ Reinforced by Icepop RL + ASystem (Trillion-Scale RL Engine)
→ Open-source SOTA in natural language reasoning: AIME 25 / HMMT 25 / ARC-AGI-1 / Codeforces

huggingface.co/inclusionAI/...
sungkim.bsky.social
Alibaba Ant Group's Ring-1T

Alibaba Ant Group previously released Ling-1T, a non-thinking model. Now it has released Ring-1T, a thinking model that achieves silver-level IMO performance through pure natural language reasoning.
sungkim.bsky.social
and (iii) higher inference efficiency, with a MIMO formulation that raises arithmetic intensity.

openreview.net/pdf?id=HwCva...