Evals, metrics, multilinguality, multiculturality, multimodality, and (dabbling in) reasoning
https://saxon.me/
Are the equations supporting an argument or are they just a fancy way to express something simple? Do introduced terms do anything or get referenced anywhere?
I find the answer is usually no in the kinds of papers I review
Interestingly, this is true only for some multilingual models. Aya knows China best in Chinese, but LLaMA is always best in English.
Turning the replies to a bluesky post into the comment section for a blogpost is a small, concrete way to support the ecosystem: future visitors who want to add comments are incentivized to interact on the platform
Also, it's very easy to do:
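A rough sketch of one way this could look, not necessarily what my site does: pull the thread from the public `app.bsky.feed.getPostThread` endpoint and flatten the nested replies into a list you can render under the post. The host `public.api.bsky.app`, the helper names `fetch_thread` and `flatten_replies`, and the output fields are assumptions for illustration; the response shape (`thread` → `replies` → `post`) is my reading of the lexicon, so check it against the current API docs.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public, unauthenticated AppView endpoint.
API = "https://public.api.bsky.app/xrpc/app.bsky.feed.getPostThread"


def fetch_thread(at_uri, depth=6):
    """Fetch the thread rooted at an at:// post URI (makes a network call)."""
    query = urlencode({"uri": at_uri, "depth": depth})
    with urlopen(f"{API}?{query}") as resp:
        return json.load(resp)["thread"]


def flatten_replies(node, level=0):
    """Walk nested replies depth-first into a flat list of comment dicts."""
    comments = []
    for reply in node.get("replies", []):
        post = reply.get("post")
        if post is None:  # blocked or missing posts have no "post" field
            continue
        comments.append({
            "level": level,  # nesting depth, useful for indenting
            "handle": post["author"]["handle"],
            "text": post["record"]["text"],
        })
        comments.extend(flatten_replies(reply, level + 1))
    return comments
```

Calling `flatten_replies(fetch_thread("at://<did>/app.bsky.feed.post/<rkey>"))` at build time (or the equivalent in client-side JS) gives you the comment list; the `level` field tells you how far to indent each reply.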
Also, I am getting more and more indiewebpilled. Would any other NLPMLAI researcher-bloggers be interested in making a webring?
(Canivez and Youngstrom, 2019) and (Wasserman, 2019) do exist. Problem is they have different titles and are in different journals.
Don't generate your references, folks!
Proof: different articles present at the specified journal/volume/page number, and their titles exist nowhere on any searchable repository.
Take this as a warning to not use LMs to generate your references!
AIRe can be used to grade the "stylistic aspects" of a fantasy entity, not just match real stuff 4/5
3/5
BITS undergrads Siddharth and Arnav Yayavaram, @simi97k.bsky.social, @gneubig.bsky.social, and I made one. 1/
So many rightoid maniacs query it expecting to see their conspiracist beliefs echoed back at them only to repeatedly get gently corrected with factual information lmao
Most interestingly, our model-predicted deadlines find the OPTIMAL budget, near the plateau where further spend isn't beneficial
In this way Terminator is a tool any RM can use!
This way we can get a more comprehensive view of overthinking, from the hardest GPQA and ZebraLogic Qs to literally "2+2=?"