Views my own, but affiliations that might influence them:
ML PhD Student under Prof. Diyi Yang
2x RS Intern🦙 Pretraining
Alum NYU Abu Dhabi
Burqueño
he/him
In my summer work on the Meta Llama team, we introduce UtiliMax and MEDU, new methods to estimate data utility and optimize data mixes efficiently.
HF Blog: huggingface.co/blog/WillHel...
ArXiv: arxiv.org/abs/2501.11747
Voice cloning is unfortunately a capability which inherently shows up in pretrained audio models. It would be great to be able to largely limit the capability at the level of model weights!
That's not great. So with @frimelle.bsky.social, we brainstormed a new idea for developers who want to curb malicious use: ✨The Voice Consent Gate.✨
Details, code, here: huggingface.co/blog/voice-c...
cc: Sen. Bill Cassidy
If I were to guess, the attention sink is what allows them to omit QK-Norm which has become otherwise standard.
www.evanmiller.org/attention-is...
Come see work from
@yanzhe.bsky.social,
@dorazhao.bsky.social, @oshaikh.bsky.social,
@michaelryan207.bsky.social, and myself at any of the talks and posters below!
My work is all presented tomorrow, but today you'll find me at the poster session from 11-12:30 evangelizing
my labmate Yanzhe Zhang's work on his behalf.
If you're interested in the risks traditional pop-up attacks present for AI agents, come chat!
I was curious, does AdamC just work?
So over the weekend, I ran 4 experiments—130M to 1.4B params—all at ~compute-optimal token counts...🧵
From the details, you can tell @kyutai-labs.bsky.social is focused on real-world utility.
Congratulations to all the authors of the three best papers and three honorable mention papers.
Be sure to check out their presentations at the conference next week!
facct-blog.github.io/2025-06-20/b...
Sen. Alex Padilla is then forcibly removed!
While AI R&D races to automate everything, we took a different approach: auditing what workers want vs. what AI can deliver across the US workforce.🧵
We Z-Lossed our way through the pain, but cool to see some stronger theory: marin.readthedocs.io/en/latest/re...
You need to be both sinophobic and irrational to expect the US to continue as the global scientific powerhouse with these policy own-goals.
stanforddaily.com/2025/05/22/f...
x.com/percyliang/s...
Not just the final models/code/data, but also negative results, toy experiments, and even spontaneous discussions.
That's what we're trying @ marin.community
I hope things like this are placebos, but if not we need to seriously consider whether existing peer-review processes for big ML conferences are providing value.
A new benchmark for evaluating the capabilities required for speech-in-speech-out voice assistants!
- Latency
- Instruction following
- Function calling
- Tone awareness
- Turn taking
- Audio Safety
TalkArena.org/cava