Kathy
@kathaem.bsky.social
Computational Linguistics / Multilingual Language Models
Into SciFi, choir, cats (incomplete list of interests)
they/them
@aclanthology.org not sure where to report, but in the last few months I've often had issues with long loading times/timeouts on aclanthology.org. It's particularly bad today---maybe related to the upcoming ARR deadline?
aclanthology.org
October 3, 2025 at 10:31 AM
@aclrollingreview.bsky.social Why is the reviewing window (still) so short this cycle? Wasn't the cycle extended to ten weeks specifically to make the process more manageable? Wasn't it three weeks in past cycles? Instead reviewers don't even get two full weeks to handle 4+ submissions.
June 6, 2025 at 2:41 PM
Reposted by Kathy
TokShop @ #ICML2025 got way more submissions than expected! 📈 We could really use a few more reviewers to help out. If you have the capacity to review a #tokenization paper by Saturday, please fill out this form: forms.gle/32A6sQHQrMSb... 🙏
TokShop 2025
Registering interest in all things tokenization at TokShop @ ICML 2025 (July 18)
Consider joining the Google group for future updates!
https://groups.google.com/g/tokshop
forms.gle
June 2, 2025 at 4:40 PM
Reposted by Kathy
Beyond text: Modern AI tokenizes images too! Vision models split photos into patches, treating each 16x16 pixel square as a "token." 🖼️➡️🔤 #VisualTokenization
Interested in tokenization? Join our workshop tokenization-workshop.github.io
The submission deadline is already May 30!
tokenization-workshop.github.io
May 26, 2025 at 7:55 PM
I'll be presenting this paper in Gather Town (Session 1) in a few hours 🎊 Come along!
Happy to say that our paper "Beyond Literal Token Overlap: Token Alignability for Multilinguality" will be presented at #NAACL2025!
This is work with @tomlim.bsky.social, @jlibovicky.bsky.social, and Alex Fraser.
arxiv.org/abs/2502.06468
#newpaper #NLP #NLProc
Beyond Literal Token Overlap: Token Alignability for Multilinguality
Previous work has considered token overlap, or even similarity of token distributions, as predictors for multilinguality and cross-lingual knowledge transfer in language models. However, these very li...
arxiv.org
May 6, 2025 at 1:37 PM
Reposted by Kathy
This is a fantastic oral history of the last 10 years of NLP and AI. www.quantamagazine.org/when-chatgpt...
When ChatGPT Broke an Entire Field: An Oral History | Quanta Magazine
Researchers in “natural language processing” tried to tame human language. Then came the transformer.
www.quantamagazine.org
May 1, 2025 at 11:55 AM
Just spent two days in Göttingen at #HumanCLAIM workshop! Re-presented my poster on surveying methods for cross-lingual representation alignment, got a city tour, heard cool talks and had interesting conversations 💬💭
March 27, 2025 at 3:04 PM
Happy to say that our paper "Beyond Literal Token Overlap: Token Alignability for Multilinguality" will be presented at #NAACL2025!
This is work with @tomlim.bsky.social, @jlibovicky.bsky.social, and Alex Fraser.
arxiv.org/abs/2502.06468
#newpaper #NLP #NLProc
Beyond Literal Token Overlap: Token Alignability for Multilinguality
Previous work has considered token overlap, or even similarity of token distributions, as predictors for multilinguality and cross-lingual knowledge transfer in language models. However, these very li...
arxiv.org
March 3, 2025 at 5:04 PM
Reposted by Kathy
Following the MT Marathon, we're hosting a hackathon in Prague. Researchers and students from five institutions (+1 online) are working together to assess how robust #LLMs are to grammar errors in machine translation and related tasks. Thanks to EAMT for their support.
February 27, 2025 at 4:07 PM
@queerinai.com Hi, I was invited to review for the workshop the other day but the email is not clear on when reviews will be due. This info will be important to decide if I'm able to serve; can you share the deadlines? Thanks!
February 19, 2025 at 12:15 PM
Reposted by Kathy
Bill Labov died this morning. I'm not coherent enough to talk about how important and influential and brilliant he was. I am very sad.
I was so lucky to know him, and I am grateful every day that he (and Gillian, and Walt, etc) built an academic field where kindness is expected.
December 18, 2024 at 2:08 AM
To add to the reviewing complaints 😅 Why do authors so often respond with an absolute wall of text? (Biggest response I got this time was four comments long.) As a reviewer, I find this very tough to engage with in the short discussion period, and as an author, I try to be concise in my responses.
November 25, 2024 at 10:35 AM
Today I finally deactivated my Twitter account (not that I'd been super active there but hey) and decided to check out Bluesky. Looks like there's already a LOT of people here!
November 16, 2024 at 11:19 PM