Sam Boeve
@boevesam.bsky.social
Doctoral Researcher | Cognitive Science | Computational Psycholinguistics | FWO fellow | Bogaertslab | Ghent University
Want to explore word predictability yourself, on a sample of each corpus used in this work? Check out this app:

wordpredictabilityvisualized.vercel.app
September 2, 2025 at 7:27 AM
Modelling reading times in Dutch?

gpt2-small-dutch (huggingface.co/GroNLP/gpt2-...) or gpt2-medium-dutch-embeddings (huggingface.co/GroNLP/gpt2-...) are great options.
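For anyone who wants to try this: below is a minimal sketch (a generic recipe, not the pipeline used in this work; the example sentence is made up) of computing per-token surprisal with gpt2-small-dutch via the transformers library.

```python
# Minimal sketch: per-token surprisal (in bits) from GroNLP/gpt2-small-dutch.
# Generic recipe, not the authors' pipeline.
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

name = "GroNLP/gpt2-small-dutch"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)
model.eval()

ids = tokenizer("De kat zit op de mat.", return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits  # (1, seq_len, vocab_size)

# Position i predicts token i+1: surprisal(t) = -log2 P(token_t | tokens_<t).
# The first token has no left context, so it is skipped.
log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
targets = ids[0, 1:]
surprisal = -log_probs[torch.arange(len(targets)), targets] / math.log(2)

for tok, s in zip(tokenizer.convert_ids_to_tokens(targets.tolist()), surprisal):
    print(f"{tok:>12}  {s.item():6.2f} bits")
```

Word-level surprisal is then usually obtained by summing the surprisals of a word's subword tokens.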
September 2, 2025 at 7:27 AM
3. Predictability effects are also logarithmic in Dutch, corroborating findings from English (= a linear effect of surprisal):

For very unpredictable words, a decrease in predictability has a much larger slowing-down effect on reading times than the same decrease for highly predictable words.
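
To make that concrete (illustrative numbers, not from the paper): with surprisal defined as the negative log probability of a word in context, equal absolute drops in predictability cost far more at the unpredictable end:

```latex
s(w) = -\log_2 p(w \mid \text{context}), \qquad
\Delta s_{0.99 \to 0.98} = \log_2\frac{0.99}{0.98} \approx 0.015~\text{bits}, \qquad
\Delta s_{0.02 \to 0.01} = \log_2\frac{0.02}{0.01} = 1~\text{bit}.
```

The same 0.01 drop in probability costs roughly 70 times more surprisal, and hence reading time under a linear surprisal effect, at the low-predictability end.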
September 2, 2025 at 7:27 AM
2. Language-specific models are generally better than multilingual ones (multilingual models are shown in blue in the figure below).
September 2, 2025 at 7:27 AM
Key findings 📝

1. Smaller Dutch models often predict reading times better (= inverse scaling trend), in line with evidence from English models.

But with more context (in a book-reading corpus), larger models catch up.
September 2, 2025 at 7:27 AM
Large language models are powerful tools for psycholinguistic research.

But most evidence so far is limited to English.

How well do Dutch open-source language models fit reading times using their word predictability estimates?
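
One common way to quantify that fit (sketched here on toy data; the predictor names and baseline specification are assumptions, not the exact models from this work) is the log-likelihood gained when surprisal is added to a baseline reading-time regression:

```python
# Toy sketch of the delta log-likelihood metric: how much does surprisal
# improve a reading-time regression over a length + frequency baseline?
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
data = pd.DataFrame({
    "length": rng.integers(2, 12, n),      # word length in characters
    "log_freq": rng.normal(3.0, 1.0, n),   # log word frequency
    "surprisal": rng.gamma(2.0, 2.0, n),   # bits, from some language model
})
# Synthetic reading times with a small surprisal effect baked in.
data["rt"] = (200 + 5 * data["length"] - 10 * data["log_freq"]
              + 8 * data["surprisal"] + rng.normal(0, 30, n))

baseline = smf.ols("rt ~ length + log_freq", data=data).fit()
with_surp = smf.ols("rt ~ length + log_freq + surprisal", data=data).fit()

# Higher delta log-likelihood = surprisal explains more reading-time variance.
print(f"Delta log-likelihood: {with_surp.llf - baseline.llf:.1f}")
```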
September 2, 2025 at 7:27 AM
Overall, our results provide a psychometric leaderboard of Dutch large language models, ideal for researchers interested in effects of predictability in Dutch.

Check out our full dataset and code here:
osf.io/wr4qf/
A Systematic Evaluation of Dutch Large Language Models’ Surprisal Estimates in Sentence, Paragraph, and Book Reading
December 19, 2024 at 4:12 PM
Finally, we found a linear link between surprisal and reading times, except in the GECO corpus, where a non-linear link fitted the data best.

A challenge to the notion of a universal linear effect of surprisal.
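
One simple way to probe that (a toy sketch; the analyses in the paper may well use different models, e.g. GAMs or mixed-effects regressions) is to compare a linear fit of reading times on surprisal against a spline fit and see which is preferred:

```python
# Toy sketch: is the surprisal-reading time link linear? Compare AIC of a
# linear fit vs. a B-spline fit (patsy's bs() inside the formula).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
surprisal = rng.gamma(2.0, 2.0, 500)
rt = 250 + 9 * surprisal + rng.normal(0, 30, 500)  # linear ground truth here
data = pd.DataFrame({"surprisal": surprisal, "rt": rt})

linear = smf.ols("rt ~ surprisal", data=data).fit()
spline = smf.ols("rt ~ bs(surprisal, df=5)", data=data).fit()

# A clearly lower spline AIC would favour a non-linear link, as in GECO.
print(f"AIC linear: {linear.aic:.1f} | AIC spline: {spline.aic:.1f}")
```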
December 19, 2024 at 4:12 PM
Second, smaller Dutch models showed a better fit to reading times than the largest models, replicating the inverse scaling trend seen in English.
However, this effect varied depending on the corpus used.
December 19, 2024 at 4:12 PM
First, across three eye-tracking corpora, we found that in each case a Dutch LLM's surprisal estimates outperformed those of the multilingual model (mGPT) and the N-gram model in predicting reading times.
December 19, 2024 at 4:12 PM
3.

Does surprisal still show a linear link with reading times when estimated with a Dutch-specific language model as opposed to a multilingual model?
December 19, 2024 at 4:12 PM
2.

Do these Dutch-specific LLMs show a similar inverse scaling trend as English models?

That is, do the smaller transformer models' surprisal estimates account better for reading times than those of the very large models?
December 19, 2024 at 4:12 PM
1.

What is the best computational method for estimating word predictability in Dutch?

We compare 14 Dutch large language models (LLMs), a multilingual model (mGPT), and an N-gram model in their ability to explain reading times.
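
For the N-gram side of that comparison, here's a hedged sketch using NLTK's Kneser-Ney trigram model on toy text (the actual N-gram model and its training corpus are not specified here):

```python
# Toy sketch of N-gram surprisal with NLTK's Kneser-Ney trigram model.
# Two made-up sentences as training data; a real baseline would be trained
# on a large Dutch corpus.
from nltk.lm import KneserNeyInterpolated
from nltk.lm.preprocessing import padded_everygram_pipeline

sents = [["de", "kat", "zit", "op", "de", "mat"],
         ["de", "hond", "ligt", "op", "de", "mat"]]
train, vocab = padded_everygram_pipeline(3, sents)

lm = KneserNeyInterpolated(3)
lm.fit(train, vocab)

# logscore() returns log2 P(word | context), so surprisal is its negation.
surprisal = -lm.logscore("mat", ["op", "de"])
print(f"surprisal('mat' | 'op de') = {surprisal:.2f} bits")
```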
December 19, 2024 at 4:12 PM
The effect of word predictability on reading times is well established for English but not so much for Dutch.

We addressed this and asked three questions:
December 19, 2024 at 4:12 PM