Initial seed noise matters. And you can optimize it **without** any backprop through your denoiser via good ol' linearization. Importantly, you need to do this in Fourier space.
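Roughly how I picture the trick -- a toy sketch, with everything (the stand-in `denoiser`, the finite-difference linearization, the low-frequency preconditioner) being my own assumptions, not the paper's actual method:

```python
import numpy as np

def denoiser(z):
    # Stand-in for a frozen black-box sampler mapping seed noise -> image.
    return np.tanh(z) + 0.1 * np.roll(z, 1, axis=-1)

def loss(img, target):
    return float(np.mean((img - target) ** 2))

def optimize_seed(z0, target, steps=30, lr=5.0, eps=1e-4):
    # Linearize the black-box loss around the current seed with finite
    # differences -- no backprop through the denoiser itself.
    z = z0.copy()
    h, w = z.shape
    # Per-frequency preconditioner: favor low-frequency seed updates.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    weight = 1.0 / (1.0 + (fy ** 2 + fx ** 2) * h * w)
    for _ in range(steps):
        base = loss(denoiser(z), target)
        grad = np.zeros_like(z)
        for i in range(h):
            for j in range(w):
                zp = z.copy()
                zp[i, j] += eps
                grad[i, j] = (loss(denoiser(zp), target) - base) / eps
        # Take the step in Fourier space, where the preconditioner lives.
        Z = np.fft.fft2(z) - lr * weight * np.fft.fft2(grad)
        z = np.real(np.fft.ifft2(Z))
    return z
```

The Fourier round-trip only matters because of the per-frequency weighting; without it, the step would be identical in pixel space.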
image (point map) + mask -> Transformer -> pose / voxel -> transformer (image/mask/voxel) -> mesh / splat. Staged training with synthetic and real data & RL. I wish I could see more failure examples to know the limits.
Train a confidence predictor for tokens and merge low-confidence ones for acceleration -> faster reconstruction with VGGT/MapAnything.
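The merging step might look something like this -- a guess on my part (the run-averaging policy and all names are mine, not the paper's):

```python
import numpy as np

def merge_low_confidence(tokens, conf, thresh=0.5):
    # Keep confident tokens as-is; collapse each run of consecutive
    # low-confidence tokens into a single averaged token.
    merged, buf = [], []
    for t, c in zip(tokens, conf):
        if c >= thresh:
            if buf:
                merged.append(np.mean(buf, axis=0))  # flush the low-conf run
                buf = []
            merged.append(t)
        else:
            buf.append(t)
    if buf:
        merged.append(np.mean(buf, axis=0))
    return np.stack(merged)
```

Fewer tokens into the heavy attention layers is where the speedup would come from.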
VGGT-like architecture, but simplified to estimate depth and ray maps (not point maps). Uses teacher-student training of Depth Anything v2.
Make a rough warp, push it through an Image-to-Video model, co-denoising with the warp up until a chosen timestep, then let the model finish the rest without interference.
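In spirit it's an SDEdit-style schedule; here's a toy sketch where a smoothing operator stands in for the frozen sampler step (the blending weights and everything else are my assumptions, not the paper's recipe):

```python
import numpy as np

def refine_step(x, step, total):
    # Stand-in for one sampler step of a frozen image-to-video model:
    # a smoothing operator whose strength decays over the schedule.
    blur = (np.roll(x, 1) + x + np.roll(x, -1)) / 3.0
    return x + (blur - x) * (1.0 - step / total)

def guided_sample(x_init, warp, total=20, guide_until=8):
    # Co-denoise with the rough warp for the first `guide_until` steps,
    # then let the model finish on its own.
    x = x_init.copy()
    for step in range(total):
        x = refine_step(x, step, total)
        if step < guide_until:
            w = 1.0 - step / guide_until  # guidance decays to zero
            x = (1 - w) * x + w * warp
    return x
```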
I like simple ideas -- this one says you should consider multiple views when you prune/clone, which allows fewer Gaussians to be used for training.
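One plausible reading of "consider multiple views" -- my guess at the criterion, not the paper's exact rule: only densify a Gaussian when several views agree its screen-space gradient is large, rather than trusting a single averaged view.

```python
import numpy as np

def densify_mask(view_grads, grad_thresh=0.01, min_views=3):
    # view_grads: (num_views, num_gaussians) screen-space gradient magnitudes.
    # Clone/split only where at least `min_views` views vote "high gradient".
    votes = (np.abs(view_grads) > grad_thresh).sum(axis=0)
    return votes >= min_views
```

Requiring cross-view agreement is exactly what would keep the Gaussian count down.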
Extract Dynamic 3D Gaussians for an object -> Vision Language Models to extract physics parameters -> model a force field (wind). Leads to some fun.
VGGT extended to dynamic scenes with a dynamic mask predictor.
Tracking, waaaaaay back in the day, used to be solved with sampling methods. They are now back. Also reminds me of my first major conference work, where I looked into how much impact the initial target point has.
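For the youngsters: the classic sampling-based tracker is the bootstrap particle filter. A minimal 1D version (generic textbook recipe, not this paper's method):

```python
import numpy as np

def particle_filter(observations, n=500, motion_std=0.5, obs_std=1.0, seed=0):
    # Bootstrap particle filter: predict with a random-walk motion model,
    # weight particles by the Gaussian observation likelihood, resample.
    rng = np.random.default_rng(seed)
    particles = rng.normal(observations[0], 1.0, n)
    estimates = []
    for z in observations:
        particles += rng.normal(0.0, motion_std, n)          # predict
        w = np.exp(-0.5 * ((z - particles) / obs_std) ** 2)  # weight
        w /= w.sum()
        idx = rng.choice(n, size=n, p=w)                     # resample
        particles = particles[idx]
        estimates.append(particles.mean())
    return np.array(estimates)
```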
Make your RoPE encoding 3D by including a z axis, then manipulate your image by simply manipulating your positional encoding in 3D --> novel view synthesis. Neat idea.
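The 3D part is easy to sketch: split the feature dim into three chunks and rotate each by one of the (x, y, z) coordinates. This is my own minimal version (chunking scheme and names assumed, not taken from the paper); "manipulating the image" then amounts to re-encoding with shifted coordinates.

```python
import numpy as np

def rope_3d(x, pos, base=100.0):
    # x: (..., d) features with d divisible by 6; pos: (..., 3) coordinates.
    # Each third of the feature dim gets rotary encoding from one axis.
    d = x.shape[-1]
    chunk = d // 3
    half = chunk // 2
    out = x.copy()
    for axis in range(3):
        s = axis * chunk
        freqs = base ** (-np.arange(half) / half)
        ang = pos[..., axis:axis + 1] * freqs          # (..., half)
        a = x[..., s:s + half]
        b = x[..., s + half:s + chunk]
        out[..., s:s + half] = a * np.cos(ang) - b * np.sin(ang)
        out[..., s + half:s + chunk] = a * np.sin(ang) + b * np.cos(ang)
    return out
```

The usual RoPE property carries over: attention scores depend only on relative 3D offsets, which is what makes shifting the positional encoding act like moving the camera.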