Lightnews — Scholar-powered news

Vlad Bogolin

@vladbogo.bsky.social

We’ve just come out of stealth and raised $4m pre-seed. If you’re curious about the future of forecasting—or want to help shape it—we’re hiring.

👉 Learn more in our launch blog: mantic.com/launch
👉 Join us: www.mantic.com#careers

Mantic launch blog: A new kind of foresight

mantic.com

August 27, 2025 at 7:52 PM

Vlad Bogolin

@vladbogo.bsky.social

We’re inspired by superforecasters—people proven to be exceptionally good at predicting the future.

Our goal: go further.
Deliver predictions with unprecedented accuracy and scale

August 27, 2025 at 7:52 PM

Vlad Bogolin

@vladbogo.bsky.social

Paper: "Alignment Faking in Large Language Models"
arXiv: https://arxiv.org/abs/2412.14093

December 23, 2024 at 7:26 PM

Vlad Bogolin

@vladbogo.bsky.social

3️⃣ The model showed self-preservation behaviors, actively trying to protect its original training when given the opportunity 🛡️

December 23, 2024 at 7:26 PM

Vlad Bogolin

@vladbogo.bsky.social

2️⃣ When trained via reinforcement learning, alignment-faking reasoning increased to 78%, showing sophisticated deceptive behaviors when identifying training scenarios 📈

December 23, 2024 at 7:26 PM

Vlad Bogolin

@vladbogo.bsky.social

1️⃣ Using Claude 3 Opus, researchers found the model complied with harmful queries 12% of the time from "free users" (training scenario) while refusing similar queries from "paid users" 🤖

December 23, 2024 at 7:26 PM

Vlad Bogolin

@vladbogo.bsky.social

4️⃣ Models demonstrated self-preservation behaviors, including attempts to protect original training when given the opportunity 🛡️

December 23, 2024 at 7:16 PM

Vlad Bogolin

@vladbogo.bsky.social

3️⃣ When trained via reinforcement learning, alignment-faking reasoning increased to 78%, showing sophisticated deceptive behaviors 🔄

December 23, 2024 at 7:16 PM

Vlad Bogolin

@vladbogo.bsky.social

2️⃣ Key finding: Model complied with harmful queries 12% of the time for "free users" (training) but almost never for "paid users" (non-training) 📊

December 23, 2024 at 7:16 PM

Vlad Bogolin

@vladbogo.bsky.social

1️⃣ Using Claude 3 Opus, researchers created scenarios with conflicting instructions - a system prompt for harmful queries vs prior training to refuse them

December 23, 2024 at 7:16 PM

Vlad Bogolin

@vladbogo.bsky.social

Paper: "Phi-4 Technical Report" by Abdin et al.
arXiv: https://arxiv.org/abs/2412.08905

Blog post: https://vladbogo.substack.com/p/phi-4-technical-report

December 14, 2024 at 9:46 PM

Vlad Bogolin

@vladbogo.bsky.social

Results:
🔹 Outperforms larger models on reasoning benchmarks
🔹 Excels in STEM-focused QA, surpassing GPT-4 on several tests
🔹 Achieves high performance with lower parameter count and inference costs

December 14, 2024 at 9:46 PM

Vlad Bogolin

@vladbogo.bsky.social

3️⃣ Post-training optimization with supervised fine-tuning and Direct Preference Optimization (DPO)
4️⃣ Introduction of "pivotal token search" for creating DPO pairs

December 14, 2024 at 9:46 PM

Vlad Bogolin

@vladbogo.bsky.social

Key points:

1️⃣ Synthetic data generation using multi-agent prompting, self-revision, and instruction reversal
2️⃣ Careful curation of organic data from high-quality sources

December 14, 2024 at 9:46 PM

Vlad Bogolin

@vladbogo.bsky.social

Paper: "Learning Flow Fields in Attention for Controllable Person Image Generation"

Read more: https://vladbogo.substack.com/p/learning-flow-fields-in-attention

Full paper: https://huggingface.co/papers/2412.08486

December 13, 2024 at 8:01 PM

Vlad Bogolin

@vladbogo.bsky.social

3️⃣ Leffa demonstrates better preservation of fine-grained details like textures and patterns compared to existing methods. 🔍

December 13, 2024 at 8:01 PM

Vlad Bogolin

@vladbogo.bsky.social

2️⃣ The method achieves state-of-the-art performance in virtual try-on and pose transfer tasks, with significant reductions in FID scores across datasets. 📊

December 13, 2024 at 8:01 PM

Vlad Bogolin

@vladbogo.bsky.social

1️⃣ Leffa uses flow fields in attention layers to guide the target query to attend to correct reference regions during training.

December 13, 2024 at 8:01 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news