Lightnews — Scholar-powered news

Reposted by Siva Reddy

Gaurav Kamath @grvkamath.bsky.social · Jul 29

Our new paper in #PNAS (bit.ly/4fcWfma) presents a surprising finding—when words change meaning, older speakers rapidly adopt the new usage; inter-generational differences are often minor.

w/ Michelle Yang, @sivareddyg.bsky.social, @msonderegger.bsky.social and @dallascard.bsky.social 👇 (1/12)


Siva Reddy @sivareddyg.bsky.social · Jul 29

Age doesn't matter for picking up new word usages. The pronunciation may sound odd across generations, but not the semantics 👴👵👨👩

Dallas Card @dallascard.bsky.social · Jul 29

I am delighted to share our new #PNAS paper, with @grvkamath.bsky.social @msonderegger.bsky.social and @sivareddyg.bsky.social, on whether age matters for the adoption of new meanings. That is, as words change meaning, does the rate of adoption vary across generations? www.pnas.org/doi/epdf/10....


Reposted by Siva Reddy

VLMs4All - CVPR 2025 Workshop @vlms4all.bsky.social · Jun 6

🗓️ Save the date! It's official: The VLMs4All Workshop at #CVPR2025 will be held on June 12th!

Get ready for a full day of speakers, posters, and a panel discussion on making VLMs more geo-diverse and culturally aware 🌐

Check out the schedule below!


Siva Reddy @sivareddyg.bsky.social · May 1

The paper will be presented orally today at 4:30–4:45.

Read the paper here: arxiv.org/abs/2502.05670

Language Models Largely Exhibit Human-like Constituent Ordering Preferences

Though English sentences are typically inflexible vis-à-vis word order, constituents often show far more variability in ordering. One prominent theory presents the notion that constituent ordering is ...

arxiv.org


Siva Reddy @sivareddyg.bsky.social · May 1

Ada is an undergrad and will soon be looking for PhDs. Gaurav is a PhD student looking for intellectually stimulating internships/visiting positions. They did most of the work without much of my help. Highly recommend them. Please reach out to them if you have any positions.



Siva Reddy @sivareddyg.bsky.social · May 1

Humans tend to move heavier constituents to the end of the sentence. While LLMs show similar behaviour, what's surprising is that pretrained models behave closer to humans than instruction-tuned models. And syllables, rather than tokens, provide a better metric for heaviness.


Siva Reddy @sivareddyg.bsky.social · May 1

Incredibly proud of my students @adadtur.bsky.social and Gaurav Kamath for winning a SAC award at #NAACL2025 for their work on assessing how LLMs model constituent shifts.


Reposted by Siva Reddy

Benno Krojer @bennokrojer.bsky.social · May 1

Great work from labmates on LLMs vs humans regarding linguistic preferences: You know when a sentence kind of feels off e.g. "I met at the park the man". So in what ways do LLMs follow these human intuitions?

Mila - Institut québécois d'IA @mila-quebec.bsky.social · May 1

Congratulations to Mila members @adadtur.bsky.social , Gaurav Kamath and @sivareddyg.bsky.social for their SAC award at NAACL! Check out Ada's talk in Session I: Oral/Poster 6. Paper: arxiv.org/abs/2502.05670


Siva Reddy @sivareddyg.bsky.social · Apr 14

List of #SafetyGuaranteedLLMs talks on Monday Apr 14 2025 PDT. Speakers: @rogergrosse.bsky.social, Boaz Barak, Ethan Perez, Georgios Piliouras


Siva Reddy @sivareddyg.bsky.social · Apr 14

The most exciting event on LLM safety is happening this week at @simonsinstitute.bsky.social with many excellent speakers. Organized by @yoshuabengio.bsky.social et al. Join us in person or virtually. In collaboration with @ivado.bsky.social. More details here:

simons.berkeley.edu/workshops/sa...


Reposted by Siva Reddy

Simons Institute for the Theory of Computing @simonsinstitute.bsky.social · Apr 11

Though in-person registration is now full, you can still register to view the private livestream for next week's workshop on Safety-Guaranteed LLMs, co-organized with @ivado.bsky.social. We'll be posting live here as well.

simons.berkeley.edu/workshops/sa...


Siva Reddy @sivareddyg.bsky.social · Apr 3

sorry to hear but please don't boycott us. We are having a tough time with US already :). I hate the new system too. Earlier it was just a pdf. You can just send the report to the supervisor with pass/fail and feedback and perhaps they can take care from there.


Reposted by Siva Reddy

Benno Krojer @bennokrojer.bsky.social · Apr 1

Never been part of a project like this before - it was a very rewarding+unique experience!

Everyone in the lab contributed different chapters and it was much more exploratory than your average phd project.

My chapter studied R1's reasoning on "image generation/editing" (via ASCII) 🧵👇

1/N

Sara Vera Marjanovic @saravera.bsky.social · Apr 1

Models like DeepSeek-R1 🐋 mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1’s reasoning chains across a variety of tasks; investigating its capabilities, limitations, and behaviour.
🔗: mcgill-nlp.github.io/thoughtology/

A circular diagram with a blue whale icon at the center. The diagram shows 8 interconnected research areas around LLM reasoning represented as colored rectangular boxes arranged in a circular pattern. The areas include: §3 Analysis of Reasoning Chains (central cloud), §4 Scaling of Thoughts (discussing thought length and performance metrics), §5 Long Context Evaluation (focusing on information recall), §6 Faithfulness to Context (examining question answering accuracy), §7 Safety Evaluation (assessing harmful content generation and jailbreak resistance), §8 Language & Culture (exploring moral reasoning and language effects), §9 Relation to Human Processing (comparing cognitive processes), §10 Visual Reasoning (covering ASCII generation capabilities), and §11 Following Token Budget (investigating direct prompting techniques). Arrows connect the sections in a clockwise flow, suggesting an iterative research methodology.


Siva Reddy @sivareddyg.bsky.social · Apr 1

I will be giving a talk about this work @SimonsInstitute tomorrow (Apr 2nd, 3PM PT). Join us, either in person or virtually.

simons.berkeley.edu/workshops/fu...


Siva Reddy @sivareddyg.bsky.social · Apr 1

Introducing the DeepSeek-R1 Thoughtology -- the most comprehensive study of R1 reasoning chains/thoughts ✨. Probably everything you need to know about R1 thoughts. If we missed something, please let us know.



Reposted by Siva Reddy

Conference on Language Modeling @colmweb.org · Mar 20

A bit of a mess around the conflict of COLM with the ARR (and, to a lesser degree, ICML) reviews release. We feel this is creating a lot of pressure and uncertainty. So, we are pushing our deadlines:

Abstracts due March 22 AoE (+48hr)
Full papers due March 28 AoE (+24hr)

Plz RT 🙏


Reposted by Siva Reddy

Benno Krojer @bennokrojer.bsky.social · Mar 18

As someone who has tried to make even basic image editing work in my research (e.g. "move cup to left of table"):
Gemini's new editing capabilities are seriously impressive!

Playing around with it is quite fun...
Edit 1: "edit the image to contain 3 more people"


Siva Reddy @sivareddyg.bsky.social · Mar 4

Why do LLMs have a hard time aligning, while humans are better at it? 🌟The answer lies in the lack of a societal alignment framework for LLMs 🌍.

Incredible effort by @karstanczak.bsky.social in pulling views from multiple disciplines and experts in these fields.

arxiv.org/abs/2503.00069

Karolina Stańczak @karstanczak.bsky.social · Mar 4

📢New Paper Alert!🚀

Human alignment balances social expectations, economic incentives, and legal frameworks. What if LLM alignment worked the same way?🤔

Our latest work explores how social, economic, and contractual alignment can address incomplete contracts in LLM alignment🧵


Siva Reddy @sivareddyg.bsky.social · Feb 21

How to Get Your LLM to Generate Challenging Problems for Evaluation? 🤔 Check out our CHASE recipe. A highly relevant problem given that most human-curated datasets are crushed within days.

Arkil Patel @arkil.bsky.social · Feb 21

Presenting ✨ CHASE: Generating challenging synthetic data for evaluation ✨

Work w/ fantastic advisors Dima Bahdanau and @sivareddyg.bsky.social

Thread 🧵:


Reposted by Siva Reddy

Benno Krojer @bennokrojer.bsky.social · Dec 8

Finally it's handy that all my twitter posts got migrated here to bsky:

I'll be presenting AURORA at @neuripsconf.bsky.social on Wednesday!

Come by to discuss text-guided editing (and why imo it is more interesting than image generation), world modeling, evals and vision-and-language reasoning

Benno Krojer @bennokrojer.bsky.social · Sep 26

AURORA 🌌 is now accepted as a Spotlight at NeurIPS 🥂

We wondered if a model can do *controlled* video generation but in a *single* step?

So we built a dataset+model for “taking actions” on images via editing, or what you could call single-step controlled video gen

Benno Krojer @bennokrojer.bsky.social · Jul 9

Did you miss the recent Auroras? No problem! ✨🎆

Super excited to share AURORA, a *general* image editing model + high-quality data that improves where prev work fails the most:
Performing *action or movement* edits, i.e. a kind of world model setup

Insights/Details ⬇️


Siva Reddy @sivareddyg.bsky.social · Nov 29

Congratulations @andreasmadsen.bsky.social on successfully defending your PhD ⚔️ 🎉🎉 Grateful to you for stretching my interests into interpretability and engaging me with exciting ideas. Good luck with your mission of building faithfully interpretable models.

Andreas Madsen @andreasmadsen.bsky.social · Nov 28

I’m thrilled to share that I’ve finished my Ph.D. at Mila and Polytechnique Montreal. For the last 4.5 years, I have worked on creating new faithfulness-centric paradigms for NLP Interpretability. Read my vision for the future of interpretability in our new position paper: arxiv.org/abs/2405.05386

Interpretability Needs a New Paradigm

Interpretability is the study of explaining models in understandable terms to humans. At present, interpretability is divided into two paradigms: the intrinsic paradigm, which believes that only model...

arxiv.org


Reposted by Siva Reddy

Apoorv Khandelwal @apoorvkh.com · Nov 26

“Turn” a decoder into an encoder with LLM2Vec (github.com/McGill-NLP/l...). Seen at COLM 2024 :)

If you want the naive, training-free / model-agnostic approach: their related work section says it is most common to use the final token’s last hidden state.

GitHub - McGill-NLP/llm2vec: Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders' - McGill-NLP/llm2vec

github.com
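
For readers unfamiliar with the naive baseline mentioned in the post above (using the final token's last hidden state as the text embedding), here is a minimal sketch in plain Python. It is not LLM2Vec's method, and it does not call any model library: the function name `last_token_pool`, the nested-list layout for hidden states, and the toy numbers are all hypothetical, chosen only to illustrate the pooling step under the assumption that you already have last-layer hidden states and an attention mask (1 = real token, 0 = padding).

```python
from typing import List

Vector = List[float]


def last_token_pool(
    hidden_states: List[List[Vector]],
    attention_mask: List[List[int]],
) -> List[Vector]:
    """For each sequence, return the hidden state of its last non-padding token.

    hidden_states: [batch][position][hidden_dim] values from the final layer.
    attention_mask: [batch][position], 1 for real tokens and 0 for padding.
    """
    pooled = []
    for states, mask in zip(hidden_states, attention_mask):
        # Positions that hold real tokens; the last one is the final token,
        # regardless of whether padding sits on the left or the right.
        real_positions = [i for i, m in enumerate(mask) if m == 1]
        if not real_positions:
            raise ValueError("sequence has no non-padding tokens")
        pooled.append(states[real_positions[-1]])
    return pooled


# Toy batch (made-up numbers): two sequences, three positions, hidden size 2.
hidden = [
    [[0.1, 0.2], [0.3, 0.4], [0.0, 0.0]],  # last position is padding
    [[0.5, 0.6], [0.7, 0.8], [0.9, 1.0]],  # no padding
]
mask = [[1, 1, 0], [1, 1, 1]]

embeddings = last_token_pool(hidden, mask)
print(embeddings)
```

In a causal decoder, the final token is the only position that has attended to the whole input, which is presumably why this is the common training-free choice.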


Siva Reddy @sivareddyg.bsky.social · Nov 26

Stages of #ICLR reviewing:
Stage 1: 😍 I hope I learn something new
Stage 2: 🤗 I hope I am constructive enough while being critical. Submits review
Stage 3: 🤯 Receives 5 page response + revision with many new pages
Stage 4: 😱 Crap, how do I get out of this?
Stage 5: 😵‍💫 What year is it?


Reposted by Siva Reddy

Ofir Press @ofirpress.bsky.social · Nov 25

I wrote some thoughts on how to build good LM benchmarks: ofir.io/How-to-Build...

How to Build Good Language Modeling Benchmarks

Building benchmarks is important because they shine a spotlight on the weaknesses of existing language models and so can guide the community on how to improve them.

ofir.io


Reposted by Siva Reddy

Niclas Overby Ⓝ @overby.me · Nov 24

@sivareddyg.bsky.social Which platforms? Maybe consider @buffer.com
