Qing Yao
qyao.bsky.social
Linguistics PhD student at UT Austin
LMs’ dative alternation preferences come from both direct evidence and more general properties of language. They don’t just memorize; they generalize! See the paper for details on animacy too (interestingly, it’s more complicated!)
March 31, 2025 at 1:30 PM
The learned length preference changes with the input manipulation: the more “long-first” we make the input, the weaker the short-first preference. We think this shows the dative preferences in models come not just from datives but from general properties of English.
March 31, 2025 at 1:30 PM
For example, “The primates use tools to eat the green coconuts from the shop” becomes:
- Short-first: [tools] use [the primates] [[to] eat [[the] [green] coconuts [from the shop]]]
- Long-first: [[[from the shop] [the] coconuts [green]] eat [to]] use [the primates] [tools]
March 31, 2025 at 1:30 PM
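A minimal sketch of the reordering, inferred only from the bracketed example above and not necessarily the procedure used in the paper. It assumes each head keeps the same number of dependents on its left as in the original sentence, and that dependents are sorted by word count with ties kept in their original order. The `Node` class, `reorder` function, and the hand-built tree are hypothetical names for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A head word plus its dependents on each side (hypothetical structure)."""
    head: str
    left: list["Node"] = field(default_factory=list)
    right: list["Node"] = field(default_factory=list)

    def size(self) -> int:
        # Number of words in the constituent headed by this node.
        return 1 + sum(d.size() for d in self.left + self.right)

    def words(self) -> list[str]:
        # Linearize: left dependents, then the head, then right dependents.
        out: list[str] = []
        for d in self.left:
            out += d.words()
        out.append(self.head)
        for d in self.right:
            out += d.words()
        return out


def reorder(node: Node, long_first: bool = False) -> Node:
    """Sort each head's dependents by length, keeping the head's slot index.

    Assumption: the head stays after the same number of dependents as before.
    """
    deps = [reorder(d, long_first) for d in node.left + node.right]
    # list.sort is stable, also with reverse=True, so ties keep their order.
    deps.sort(key=Node.size, reverse=long_first)
    k = len(node.left)
    return Node(node.head, deps[:k], deps[k:])


# Hand-built analysis of the example sentence (an assumption, not parser output).
sentence = Node(
    "use",
    left=[Node("primates", left=[Node("the")])],
    right=[
        Node("tools"),
        Node("eat", left=[Node("to")], right=[
            Node("coconuts",
                 left=[Node("the"), Node("green")],
                 right=[Node("from", right=[Node("shop", left=[Node("the")])])]),
        ]),
    ],
)

short_first = " ".join(reorder(sentence).words())
long_first = " ".join(reorder(sentence, long_first=True).words())
print(short_first)  # tools use the primates to eat the green coconuts from the shop
print(long_first)   # from the shop the coconuts green eat to use the primates tools
```

Under these assumptions the sketch reproduces both word orders shown in the example; a real pipeline would need an automatic parser and a decision about how to treat punctuation and ties.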
We think it plausibly comes not from the datives alone but from general properties of English (which is “short-first”). To test that, we manipulate the global structure of the input, creating a corpus where every sentence is short-first and one where they’re all long-first.
March 31, 2025 at 1:30 PM
Now what if we get rid of datives and, further, all constructions that have two postverbal arguments? We see the length preference is back again. Yes, it’s smaller (direct evidence matters), but why is it there? Where does it come from if not the datives?
March 31, 2025 at 1:30 PM
What if we modify the corpus such that for every DO there is a PO (balancing the direct evidence)? The preferences are still present! But what if we now SWAP every dative in the input, so that every DO becomes a PO and every PO a DO? The preference essentially disappears (but does not flip!)
March 31, 2025 at 1:30 PM
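As a rough illustration of the two manipulations above, here is a sketch that assumes each dative has already been identified and split into verb, recipient, and theme. The `Dative` class, the `balance` and `swap_all` helpers, the toy sentences, and the reading of "balance" as "add the alternant of every dative" are all assumptions for illustration; PO forms are simplified to always use "to".

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Dative:
    """A pre-parsed dative clause (hypothetical representation)."""
    verb: str
    recipient: str
    theme: str
    form: str  # "DO" (double object) or "PO" (prepositional object)

    def render(self) -> str:
        if self.form == "DO":
            # DO: verb, then recipient, then theme.
            return f"{self.verb} {self.recipient} {self.theme}"
        # PO: verb, then theme, then "to" + recipient (simplification).
        return f"{self.verb} {self.theme} to {self.recipient}"


def swap(d: Dative) -> Dative:
    """Return the alternant: DO becomes PO, PO becomes DO."""
    return replace(d, form="PO" if d.form == "DO" else "DO")


def balance(corpus: list[Dative]) -> list[Dative]:
    """Keep every dative and add its alternant, so DO and PO counts match."""
    return [variant for d in corpus for variant in (d, swap(d))]


def swap_all(corpus: list[Dative]) -> list[Dative]:
    """Replace every dative with its alternant."""
    return [swap(d) for d in corpus]


# Toy corpus, made up for illustration.
corpus = [
    Dative("gave", "the dog", "a bone", "DO"),
    Dative("sent", "her", "a long letter about the trip", "PO"),
]

for d in swap_all(corpus):
    print(d.form, "|", d.render())
```

Note that swapping also reverses which argument comes first, so a corpus whose datives were mostly short-first becomes mostly long-first within the datives while the rest of the input is untouched.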
To test this, we train small LMs on manipulated datasets where we vary direct (datives) and indirect (non-datives) evidence and test the change in their preferences. First, we see that we get human-like preferences from a model trained on our default BabyLM corpus.
March 31, 2025 at 1:30 PM
The English dative preferences come from more general features of the language: short constituents tend to appear earlier all over, not just in the dative. We hypothesize LMs rely on direct evidence from datives but also general word order preferences (e.g. “easy first”) from non-datives.
March 31, 2025 at 1:30 PM