Kenneth Enevoldsen
@kennethenevoldsen.bsky.social
Postdoc at Aarhus University working on developing and evaluating representations of language and more
Maintain and develop: MTEB, ScandEval, tomsup, DaCy, etc.
#NLPProc
Love to see it!
March 25, 2025 at 10:27 AM
New postdoc position at AarhusNLP, come join us!
The research includes efficient post-training, alignment, evaluation, and preference optimization, but we are very flexible about reinterpretation. So if you think you might be a partial fit, do apply!
international.au.dk/about/profil...
Postdoctoral Positions in NLP Post-Training for Cultural Alignment and Preference Optimization - Vacancy at Aarhus University
Vacancy at School of Culture and Society - Center for Humanities Computing Aarhus, Aarhus University
international.au.dk
March 12, 2025 at 12:46 PM
Last week at #NoDaLiDa, we presented our work on 🇪🇺EuroEval, a large-scale benchmark for evaluating decoders and encoders.
The framework covers 9 languages (🇬🇧🇫🇷🇩🇪🇳🇱🇸🇪🇩🇰🇳🇴🇮🇸🇫🇴), with more to come, each including both a language understanding and a generation benchmark.
March 11, 2025 at 10:07 AM
I am delighted to announce that we have released 🎊 MMTEB 🎊, a large-scale collaboration working on efficient multilingual evaluation of embedding models.
This work implements >500 evaluation tasks across >1000 languages and covers a wide range of use cases and domains🩺👩💻⚖️
February 20, 2025 at 9:56 AM
Multilingual MTEB is soon to be released, and with it a shiny new benchmark with a zero-shot filter! However, zero-shot is quite hard to define in a time of derivative models and synthetic data.
If you have an opinion on how zero-shot should be defined, let us know:
github.com/embeddings-b...
Defining zero-shot for MTEB · Issue #1760 · embeddings-benchmark/mteb
The next version of the MTEB leaderboard will soon be released and with it a new zero-shot filter. However, we are currently planning to use the following definition of zero-shot. This issue is to ...
github.com
January 11, 2025 at 12:22 PM
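One way to see why the definition is slippery is to write the naive criterion down. The sketch below is purely illustrative, not MTEB's actual filter; the function and dataset names are hypothetical. It treats zero-shot as a simple disjointness check between a model's training datasets and a benchmark's datasets, which immediately raises the hard cases from the issue: derivative models inherit their base model's training data, and synthetic data may be distilled from benchmark tasks.

```python
# Hypothetical sketch of a naive zero-shot criterion: a model is zero-shot
# on a benchmark if its training datasets are disjoint from the benchmark's
# datasets. Dataset names below are illustrative placeholders.

def is_zero_shot(training_datasets: set[str], benchmark_datasets: set[str]) -> bool:
    """True if no benchmark dataset appears in the model's training data."""
    return training_datasets.isdisjoint(benchmark_datasets)

print(is_zero_shot({"msmarco"}, {"nfcorpus", "scifact"}))   # True
print(is_zero_shot({"msmarco", "nfcorpus"}, {"nfcorpus"}))  # False
```

The naive check breaks down as soon as "training data" is itself fuzzy: should a fine-tune of a contaminated base model count, and does synthetic data generated from a benchmark dataset make that dataset part of the training set?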
New favourite razor
January 11, 2025 at 11:43 AM
Reposted by Kenneth Enevoldsen
📣 Vacancy for Assistant Professor of Cognitive Science at Department of Linguistics, Cognitive Science and Semiotics, Aarhus University, Denmark. (Deadline January 6)
international.au.dk/about/profil...
Assistant Professor of Cognitive Science at the School of Communication and Culture - Vacancy at Aarhus University
Vacancy at School of Communication and Culture - Linguistics, Cognitive Science and Semiotics, Dept. of, Aarhus University
international.au.dk
December 24, 2024 at 9:37 PM
Does anyone know of good methods for tracking dataset contamination that don't rely on generation? Anything would be greatly appreciated.
Asking as we would like to track and detect dataset contamination in MTEB:
github.com/embeddings-b...
track and detect dataset contamination · Issue #1636 · embeddings-benchmark/mteb
In multiple threads tracking dataset contamination have been mentioned as multiple models do not share their training dataset. This issue is intended to link these discussions together as well as p...
github.com
December 27, 2024 at 6:59 PM
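For context, one common generation-free family of methods is surface-level n-gram overlap between training and evaluation data, as used in several LLM contamination audits. The sketch below is a hypothetical minimal version, not MTEB code; the function names, the choice of n, and whitespace tokenization are all illustrative assumptions.

```python
# Illustrative sketch: n-gram overlap contamination check (generation-free).
# A high score means many of the eval document's n-grams also occur in the
# training corpus, suggesting possible contamination.

def ngrams(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """All word n-grams of a text, lowercased, whitespace-tokenized."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contamination_score(eval_doc: str, train_ngrams: set, n: int = 3) -> float:
    """Fraction of the eval document's n-grams found in the training set."""
    doc_ngrams = ngrams(eval_doc, n)
    if not doc_ngrams:
        return 0.0
    return len(doc_ngrams & train_ngrams) / len(doc_ngrams)

train_corpus = ["the quick brown fox jumps over the lazy dog"]
train_ngrams = set().union(*(ngrams(d) for d in train_corpus))

print(contamination_score("the quick brown fox ran away", train_ngrams))  # 0.5
```

In practice larger n (e.g. 8–13 tokens) and proper tokenization are used to reduce false positives, and the training side is often only available as Bloom filters or hashes when model builders don't release their data.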
Reposted by Kenneth Enevoldsen
This week's episode of the Verbos podcast is live, with @kennethenevoldsen.bsky.social and Thomas Kobber Panum talking about the development of AI, taking Ilya Sutskever's NeurIPS talk as the starting point 🔥 #dkai #dktech
Listen here 👇
YouTube: youtu.be/IpEla8mZHnU?...
Spotify: open.spotify.com/episode/41WT...
December 20, 2024 at 7:45 AM
December 16, 2024 at 8:05 PM
Any good reason for using (machine) translated datasets/benchmarks for evaluating language models?
December 6, 2024 at 10:43 PM
On my way to NeurIPS to present our work on the Scandinavian Embedding Benchmark (SEB), evaluating embeddings for retrieval, classification, etc. in the mainland Scandinavian languages.
Is there anything I should see while I am there?
December 5, 2024 at 11:43 AM