https://mdhk.net/
https://scholar.social/@mdhk
🐦 https://twitter.com/mariannedhk
We took inspiration from classic phonetic categorization experiments to explore where sensitivity to phonotactic context emerges in Wav2Vec2 models.
(w/ @wzuidema.bsky.social)
arxiv.org/abs/2407.03005
⬇️
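Not the paper's own code, but a minimal sketch of how one can pull layer-wise representations out of a Wav2Vec2 model with HuggingFace transformers, the kind of features one would probe for phonotactic-context effects; the checkpoint name and audio file below are placeholders.

```python
# Minimal sketch (not the paper's code): extract layer-wise Wav2Vec2
# representations for a stimulus, the kind of features one could probe
# for sensitivity to phonotactic context. Checkpoint and file path are
# placeholders.
import soundfile as sf
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

model_name = "facebook/wav2vec2-base"  # placeholder checkpoint
extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_name)
model = Wav2Vec2Model.from_pretrained(model_name, output_hidden_states=True)
model.eval()

waveform, sr = sf.read("stimulus.wav")  # hypothetical 16 kHz mono stimulus
inputs = extractor(waveform, sampling_rate=sr, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Tuple of (num_transformer_layers + 1) tensors, each (batch, frames, dim):
# the CNN feature encoder output plus every transformer layer.
hidden_states = outputs.hidden_states
print(len(hidden_states), hidden_states[-1].shape)
```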
www.eventbrite.com/e/4th-dutch-...
The textual basis of current LLMs causes trouble, but linguistically relevant insights *can* be found in systems modelling the more natural form of human spoken language: the speech signal itself. arxiv.org/abs/2512.14506
This is open access; MIT Press will post a link soon, but until then, the book is available on my website:
tedlab.mit.edu/tedlab_websi...
@sashakenjeeva.bsky.social openreview.net/forum?id=Vtd...
github.com/markvandenho... openreview.net/forum?id=rX3...
@nina-nusbaumer.bsky.social openreview.net/forum?id=GRz...
www.ru.nl/personen/sui... openreview.net/forum?id=NcJ...
Yesterday MSc student Sven Terpstra (co-supervised w/ @wzuidema.bsky.social) presented his project on predicting the N400 with GPT-derived metrics beyond surprisal openreview.net/forum?id=MAl...
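The project's extra metrics aren't detailed here, but for reference, a minimal sketch of the per-token GPT surprisal baseline that such N400 work typically builds on (model choice and sentence are just illustrative):

```python
# Illustrative sketch (not the presented project): per-token surprisal
# under GPT-2, the baseline predictor that N400 modelling usually starts from.
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

sentence = "The cat sat on the mat."  # placeholder stimulus
ids = tok(sentence, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits  # (1, seq_len, vocab_size)

# Surprisal of each token given its left context: -log2 p(w_t | w_<t)
log_probs = F.log_softmax(logits[:, :-1, :], dim=-1)
targets = ids[:, 1:]
nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
surprisal_bits = nll / torch.log(torch.tensor(2.0))

for token, s in zip(tok.convert_ids_to_tokens(targets[0].tolist()), surprisal_bits[0]):
    print(f"{token:>10s}  {s.item():5.2f} bits")
```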
A short thread about my new paper in @cadlin.bsky.social
This work has the most original insight I've ever had, a genuinely new idea about the nature of language
cadernos.abralin.org/index.php/ca...
1/20
A 🧵 of takeaways from our paper doi.org/10.1007/s421... with @andreaeyleen.bsky.social
The 'Design Features' of Language Revisited (w/ @mperlman.bsky.social @glupyan.bsky.social Koen de Reus & @limorraviv.bsky.social)
Feature Review out now in #OpenAccess in @cp-trendscognsci.bsky.social! #language #linguistics
Paper: doi.org/10.1016/j.ti...
"Hierarchical dynamic coding coordinates speech comprehension in the brain"
with dream team @alecmarantz.bsky.social, @davidpoeppel.bsky.social, @jeanremiking.bsky.social
Summary:
1/8
"Hierarchical dynamic coding coordinates speech comprehension in the brain"
with dream team @alecmarantz.bsky.social, @davidpoeppel.bsky.social, @jeanremiking.bsky.social
Summary π
1/8
LLMs learn from vastly more data than humans ever experience. BabyLM challenges this paradigm by focusing on developmentally plausible data
We extend this effort to 45 new languages!
They show that LLMs implicitly apply an internal low-rank weight update adjusted by the context. It's cheap (due to the low rank) but effective for adapting the model's behavior.
#MLSky
arxiv.org/abs/2507.16003
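A toy numerical gloss (mine, not the paper's derivation) of why a low-rank update is cheap: the correction to a weight matrix stays factored as A @ B and never has to be materialised in full.

```python
# Toy illustration (my gloss, not the paper's derivation): a rank-r update
# to a d_out x d_in weight matrix can be kept factored as A @ B, costing
# r * (d_out + d_in) numbers instead of d_out * d_in, and applied without
# ever materialising the full update.
import numpy as np

d_out, d_in, r = 4096, 4096, 8  # hypothetical layer size and update rank
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # frozen base weights
A = 0.01 * rng.standard_normal((d_out, r))  # context-dependent factor
B = 0.01 * rng.standard_normal((r, d_in))   # context-dependent factor

x = rng.standard_normal(d_in)

# Adapted layer output: (W + A @ B) @ x, computed in factored form
y = W @ x + A @ (B @ x)

print(f"full update: {d_out * d_in:,} params; "
      f"rank-{r} update: {r * (d_out + d_in):,} params")
```

The same factored form is what parameter-efficient finetuning methods such as LoRA exploit explicitly.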
Come work with Mirjam Broersma, @davidpeeters.bsky.social, and me at the Centre for Language Studies, Radboud University in the Netherlands.
Application deadline: 19 October 2025
For more information, see
www.ru.nl/en/working-a...
My tutorial on speech analysis tools in Python from the Unboxing Multimodality summer school (github.com/mdhk/unboxin...) is now also available at envisionbox.org
Thanks for the invitation & this great initiative!
@babajideowoyele.bsky.social @jamestrujillo.bsky.social @sarkadava.bsky.social @DavideAhmar @acwiek.bsky.social
Amazing Markus Küpper made an animated video:
www.youtube.com/watch?v=HduI...
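The tutorial covers much more, but as a taste of what speech analysis in Python looks like, here is a small sketch (not taken from the tutorial) that extracts a pitch track and frame-wise energy with librosa; the file name is a placeholder.

```python
# Small sketch of the kind of analysis such Python tools support
# (not the tutorial's own code): F0 and frame-wise energy with librosa.
# The audio file name is a placeholder.
import librosa
import numpy as np

y, sr = librosa.load("recording.wav", sr=16000)  # hypothetical mono recording

# Fundamental frequency track via the pYIN algorithm (NaN where unvoiced)
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
)

# Frame-wise RMS energy as a rough intensity measure
rms = librosa.feature.rms(y=y)[0]

print(f"mean F0 over voiced frames: {np.nanmean(f0):.1f} Hz")
print(f"mean RMS energy: {rms.mean():.4f}")
```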
A week has already flown by since I had one of the most formative experiences of my PhD so far.
And to @itcooperativesurf.bsky.social (EINF-8324) for granting me the resources that enabled this project 👩‍💻✨
arxiv.org/abs/2506.00981
Or the model, dataset and code released alongside it:
🤗 huggingface.co/amsterdamNLP...
zenodo.org/records/1554...
github.com/mdhk/SSL-NL-...
We hope these resources help further research on language-specificity in speech models!