Chengzu
@chengzu-li.bsky.social
PhD student at Language Technology Lab, University of Cambridge
🔗 MVoT + CoT: New Ceiling for Reasoning

MVoT doesn’t replace CoT; it elevates it. Combining MVoT with CoT lets multimodal and verbal reasoning reinforce each other and unlocks a new performance upper bound, showing that two reasoning paradigms can be better than one!
January 14, 2025 at 2:50 PM
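One simple way to combine the two paradigms (an illustrative assumption, not necessarily the paper's exact protocol): sample final answers from both MVoT and CoT traces, then take a majority vote across the pooled answers.

```python
from collections import Counter

def combined_vote(cot_answers, mvot_answers):
    """Majority vote over final answers pooled from CoT and MVoT runs.

    Hypothetical combination scheme for illustration; each argument is a
    list of final-answer strings extracted from sampled reasoning traces.
    """
    votes = Counter(cot_answers) + Counter(mvot_answers)
    return votes.most_common(1)[0][0]

print(combined_vote(["left", "left", "up"], ["left", "up"]))  # "left" wins 3-2
```

Pooling before voting means an answer only needs combined support from both paradigms, which is one way the two can exceed either alone.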
🎨 Revolutionizing Visual Reasoning with Token Discrepancy Loss

Messy visuals? Not anymore. Our token discrepancy loss ensures that MVoT generates accurate, meaningful visualizations with less redundancy.

Result? Better images, clearer reasoning, stronger performance.
January 14, 2025 at 2:50 PM
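The idea behind a token-discrepancy-style loss can be sketched as follows (a minimal NumPy sketch with my own names, not the paper's implementation): plain cross-entropy treats all wrong visual tokens as equally wrong, so instead we penalize predicted probability mass in proportion to how far each codebook token's embedding lies from the label token's embedding.

```python
import numpy as np

def token_discrepancy_loss(probs, labels, codebook):
    """Expected embedding distance between prediction and label.

    probs:    (T, V) predicted distribution over V visual tokens, per position.
    labels:   (T,)   ground-truth visual token ids.
    codebook: (V, d) one embedding vector per visual token.
    """
    # Squared embedding distance from every codebook token to each label token.
    diff = codebook[None, :, :] - codebook[labels][:, None, :]  # (T, V, d)
    dist = (diff ** 2).mean(axis=-1)                            # (T, V)
    # Expected distance under the predicted distribution, averaged over positions.
    return float((probs * dist).sum(axis=-1).mean())
```

A model that puts all its mass on the correct token incurs zero loss; mass on visually similar tokens costs little, while mass on dissimilar tokens is penalized heavily, which is what discourages messy, redundant visualizations.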
🎯 Performance Boosts with MVoT

MVoT isn’t just new—it’s better.
🔥 Better and more stable performance than CoT, particularly in complex scenarios like FrozenLake.
🌟 Plug-and-play power: Supercharges models like GPT-4o for unprecedented versatility.
January 14, 2025 at 2:50 PM
🧠 MVoT

MVoT moves beyond Chain-of-Thought (CoT), enabling AI to imagine what it thinks through generated images. By blending verbal and visual reasoning, MVoT makes tackling complex problems more intuitive, interpretable, and powerful.
January 14, 2025 at 2:50 PM
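The interleaving described above can be sketched as a loop that alternates verbal thoughts with generated image thoughts (a hypothetical interface for illustration; MVoT itself fine-tunes a single autoregressive multimodal model, and `ToyModel` here is just a stand-in so the sketch runs):

```python
def mvot_reason(model, prompt, max_steps=8):
    """Interleave verbal thoughts with generated image 'thoughts'."""
    trace = [("text", prompt)]
    for _ in range(max_steps):
        thought = model.generate_text(trace)   # verbal reasoning step
        trace.append(("text", thought))
        if model.is_final_answer(thought):
            return thought, trace
        image = model.generate_image(trace)    # imagined visualization of state
        trace.append(("image", image))
    return None, trace

class ToyModel:
    """Stand-in model: answers after two move/visualize rounds."""
    def __init__(self):
        self.step = 0
    def generate_text(self, trace):
        self.step += 1
        return "Answer: reached goal" if self.step == 3 else f"move {self.step}"
    def generate_image(self, trace):
        return f"<grid after step {self.step}>"
    def is_final_answer(self, text):
        return text.startswith("Answer:")

answer, trace = mvot_reason(ToyModel(), "Navigate the maze.")
```

Each generated image re-grounds the next verbal step in the current spatial state, which is what makes the trace both more interpretable and easier to verify than text alone.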
🔔 Our New Preprint
🚀 A New Era of Multimodal Reasoning 🚨
🔍 Imagine While Reasoning in Space with MVoT

Forget just thinking in words. Multimodal Visualization-of-Thought (MVoT) revolutionizes reasoning by generating visual "thoughts" that transform how AI thinks, reasons, and explains itself.
January 14, 2025 at 2:50 PM