William Merrill
@lambdaviking.bsky.social
Will irl - PhD student @ NYU on the academic job market!

Using complexity theory and formal languages to understand the power and limits of LLMs

https://lambdaviking.com/ https://github.com/viking-sudo-rm
📜Paper link: arxiv.org/pdf/2503.03961
March 7, 2025 at 4:46 PM
We take these results to justify further theoretical and empirical analysis of dynamic depth as a form of test-time computation for transformers.

(we’re excited to share more results ourselves soon🤐)
March 7, 2025 at 4:46 PM
Our results suggest dynamic depth can be a more efficient form of test-time compute than chain of thought (at least for regular languages). While CoT would use ~n steps to recognize a regular language on inputs of length n, looped transformers only need ~log n depth.
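A back-of-the-envelope sketch of that comparison (illustrative only; `cot_steps` and `looped_depth` are hypothetical names, not anything from the paper):

```python
import math

def cot_steps(n):
    # Chain of thought: roughly one sequential decoding step per input token.
    return n

def looped_depth(n):
    # Looped transformer: repeat a shared block ~ceil(log2(n)) times.
    return math.ceil(math.log2(n))

for n in [16, 1024, 1_000_000]:
    # e.g. n = 1024: ~1024 CoT steps vs ~10 repeated blocks
    print(n, cot_steps(n), looped_depth(n))
```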
March 7, 2025 at 4:46 PM
In contrast, both in theory and practice, width must grow exponentially with sequence length to enable regular language recognition. Thus, while slightly increasing depth expands expressive power, increasing width to gain power is intractable!
March 7, 2025 at 4:46 PM
In practice, can transformers learn to solve these problems with log depth?

We find the depth required to recognize strings of length n grows ~log n with r² = 0.93. Thus, log depth appears necessary and sufficient to recognize regular languages in practice, matching our theory.
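The kind of fit involved can be sketched like this, on synthetic stand-in data (the depths below are just ⌈log₂ n⌉, not the paper's measurements, so this toy fit comes out essentially perfect rather than r² = 0.93):

```python
import math

# Synthetic, illustrative depths only -- NOT the paper's measured values.
ns = [8, 16, 32, 64, 128, 256, 512, 1024]
depths = [math.ceil(math.log2(n)) for n in ns]  # stand-in for measured minimal depth

# Ordinary least squares of depth against ln(n), plus r².
xs = [math.log(n) for n in ns]
mx = sum(xs) / len(xs)
my = sum(depths) / len(depths)
slope = sum((x - mx) * (y - my) for x, y in zip(xs, depths)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx
pred = [slope * x + intercept for x in xs]
ss_res = sum((y - p) ** 2 for y, p in zip(depths, pred))
ss_tot = sum((y - my) ** 2 for y in depths)
r2 = 1 - ss_res / ss_tot
print(f"depth ~ {slope:.2f}*ln(n) + {intercept:.2f}, r^2 = {r2:.3f}")
```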
March 7, 2025 at 4:46 PM
They are! We show that log-depth transformers can express two key problems that fixed-depth transformers provably cannot:

♟️State tracking (regular languages)
🔍Graph search (connectivity)
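For intuition on the graph-search side, here's a sketch (not the paper's actual construction) of why log depth suffices: connectivity can be decided with ~log₂ n rounds of boolean matrix squaring, since each round doubles the path length covered.

```python
def connected(adj, s, t):
    # adj: n x n adjacency matrix (0/1). Returns whether t is reachable from s.
    n = len(adj)
    # reach[i][j]: a path of length <= current horizon exists from i to j
    reach = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    rounds = max(1, (n - 1).bit_length())  # ~ceil(log2(n)) squarings
    for _ in range(rounds):
        # Boolean matrix "squaring": doubles the reachable path length.
        reach = [[any(reach[i][k] and reach[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
    return reach[s][t]
```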
March 7, 2025 at 4:46 PM
We address both these questions by studying the expressive power of looped transformers with log depth. On input length n, a block of layers can be repeated log n times (with shared weights across blocks).

🤔Are log-depth transformers more powerful than fixed-depth transformers?
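A toy illustration of the looped idea (hypothetical code, not the paper's model): apply the same shared "block" ⌈log₂ n⌉ times, where each application composes per-token state-transition functions by parallel doubling. This is the function-composition trick behind recognizing regular languages in log depth.

```python
import math

def recognize(word, delta, start, accepting):
    # delta: total transition function {(state, char): state} of a DFA.
    states = sorted({s for (s, _) in delta} | set(delta.values()))
    idx = {s: i for i, s in enumerate(states)}
    # Each token becomes a state map, encoded as a tuple over state indices.
    fs = [tuple(idx[delta[(s, ch)]] for s in states) for ch in word]
    n = len(fs)
    rounds = math.ceil(math.log2(n)) if n > 1 else 0
    for k in range(rounds):
        shift = 1 << k
        # Shared "block": compose each position's map with the one 2^k earlier,
        # so position i eventually holds the composition of all prefixes f_0..f_i.
        fs = [tuple(fs[i][fs[i - shift][j]] for j in range(len(states)))
              if i >= shift else fs[i]
              for i in range(n)]
    final = states[fs[-1][idx[start]]]
    return final in accepting
```

For example, with a two-state parity DFA over {a, b}, `recognize` accepts exactly the strings with an even number of b's, using only ⌈log₂ n⌉ rounds of the shared block.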
March 7, 2025 at 4:46 PM
Past work has shown that the bounded depth of transformers provably limits their ability to express sequential reasoning, particularly on long inputs.

But what if we only care about reasoning over short inputs? Or if the transformer’s depth can grow 🤏slightly with input length?
March 7, 2025 at 4:46 PM
I try to! (And usually do eventually, on trains or planes)
November 30, 2024 at 1:45 PM
Looks cool, added to my reading list!
November 29, 2024 at 6:10 PM
Agreed (and defer to you), but I thought it was fun, and it stood out to me compared to other LM outputs that were even more terrible. I don’t think it proves anything deeper about LMs’ ability to generate formally correct poetry
November 29, 2024 at 1:48 PM
To be clear, the grammar and wording are weird, but the vibes are impeccable
November 29, 2024 at 1:04 AM
🔥 Old Norse poetry gen

The Vikings call, say now,
OLMo 2, the ruler of languages.
May your words fly over the seas,
all over the world, for you are wise.
Wordsmith, balanced and aligned,
for you the skalds themselves sing,
your soul, which hears new lifeforms,
may it live long and tell a saga.
November 28, 2024 at 6:19 PM
If you're interested in being added, fill out this Google form (you can also ping me to let me know once you've filled out the form)
docs.google.com/forms/d/e/1F...
FLaNN Starter Pack Survey
Since a lot of academics are considering switching from X to BlueSky, we plan to make a BlueSky starter pack with FLaNN accounts to make it easier to find FLaNN research on BlueSky. We'd like to inclu...
November 26, 2024 at 4:02 PM